LINQ to Lucene in a distributed environment

Mar 19, 2009 at 3:39 PM
How can I use LINQ to Lucene in a distributed environment? I would love to get the data from the database using LINQ to Lucene and store it in memory. What I don't know is how to query the data stored in memory using LINQ to Lucene in a distributed environment.
Any idea would be appreciated. Sample code or not, I would love to hear any technical idea on how to achieve this.
Thanks in advance.
Delano
Coordinator
Mar 21, 2009 at 3:05 PM
"How can I use LINQ to Lucene in a distributed environment?"

There are several ways, and each depends on how your software needs to scale and how your system is architected.

"What I don't know is how to query the data stored in memory using LINQ to Lucene in a distributed environment."

There are two ways I can see that would work:
  1. Store the index in a distributed storage system (like Velocity, Memcached, etc.). Then use standard queries. I haven't tried this, but would be surprised if it worked well.
  2. This is what's worked for me. Create a WCF search service that resides in your system, and treat it as part of the DB tier or the application tier. The clients of the search service would be the web servers and anything else that needs to search. This approach lets you hide (aka encapsulate) the details of how the search engine operates. The search service could be as simple as
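// Assumes: using System.Collections.Generic; using System.ServiceModel;
// MyDocument, MyQueryObject and PageDetails are your own types, not part of any library.
[ServiceContract] // WCF marks the contract with [ServiceContract] and each exposed method with [OperationContract]
public interface ISearchService {
  [OperationContract]
  IList<MyDocument> Search(MyQueryObject qobj, PageDetails paging);
}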

This raises the question of where the LINQ query resides: on the search-server or on the search-client? For now, the LINQ query itself must reside on the search-server side, because there is a tight coupling between the QueryProvider and the IndexContext (the thing that holds a reference to the Lucene index).
This means you can't put LINQ on your search-client side, and your SearchService interface will need to offer its own query semantics. Then, on the server side of the WCF service, a custom translation from MyQueryObject to a LINQ query would need to occur.

It should be possible to decouple the provider from the IndexContext so that LINQ to Lucene queries are traversed on the client, but this would take some work. Thanks for pointing this scenario out: decoupling the traversal (the conversion from LINQ to the underlying Lucene query) from the execution (the actual Lucene search) and the projection (the select expression) would make this project more like LINQ to SQL.

For now, you're stuck with writing your own translator to talk to the search-server. But this isn't much work if your query language is simple, and most query languages are very simple.
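
A minimal sketch of such a server-side translator, under stated assumptions: MyDocument, MyQueryObject and PageDetails are named in this thread but never defined, so the fields below are hypothetical, and `source` stands in for whatever IQueryable the index context exposes. Whether the LINQ to Lucene provider supports each operator used here (Contains, ==, Skip, Take) is also an assumption and would need checking; the code itself only relies on standard System.Linq.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical document type; replace with your own indexed entity.
public class MyDocument
{
    public string Title { get; set; }
    public string Author { get; set; }
}

// Hypothetical wire-level query object sent by the search-client.
public class MyQueryObject
{
    public string TitleContains { get; set; } // null or empty means "no filter"
    public string Author { get; set; }        // null or empty means "no filter"
}

// Hypothetical paging details.
public class PageDetails
{
    public int PageIndex { get; set; } // zero-based
    public int PageSize { get; set; }
}

public static class QueryTranslator
{
    // Builds the LINQ query on the server from the client's query object,
    // then applies paging and materializes the page.
    public static IList<MyDocument> Search(
        IQueryable<MyDocument> source, MyQueryObject qobj, PageDetails paging)
    {
        IQueryable<MyDocument> query = source;

        // Copy to locals so the expression tree captures simple values.
        string title = qobj.TitleContains;
        string author = qobj.Author;

        if (!string.IsNullOrEmpty(title))
            query = query.Where(d => d.Title.Contains(title));

        if (!string.IsNullOrEmpty(author))
            query = query.Where(d => d.Author == author);

        return query
            .Skip(paging.PageIndex * paging.PageSize)
            .Take(paging.PageSize)
            .ToList();
    }
}
```

The search-client only ever sees MyQueryObject and PageDetails, so the LINQ query (and the provider it is bound to) stays entirely on the search-server.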

Hope that helps,
CV


Mar 21, 2009 at 9:37 PM
Hi,
Thanks so much for your post.
Actually, what I want is a way to query my database entities using LINQ to Lucene (which is what I did with linq2lucene; thanks for the tool).
After that, I want the server that stores the index to split it based on configuration (number of splits, e.g. 3). How can I achieve the splitting of the index?

I would then want to send each split to a different system (I can send them once the index is split).

Then, I would love to know how to query any of the split indexes.

e.g. querying Index1 using LINQ to Lucene.
I guess the index is an IndexContext here. How can I query the index context, how can I split it into several index contexts, and how can I search any of them?

With this, I hope the distributed environment will be possible.


Any suggestion will be appreciated.
Thanks in advance,
Delano.

Coordinator
Apr 12, 2009 at 6:08 AM
"After that, I want the server that stores the index to split it based on configuration (number of splits, e.g. 3). How can I achieve the splitting of the index?"

Lucene (AFAIK) supports weaving together search results from multiple indexes, but expects the developer to actually split up the index themselves.
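
To make the two halves of that concrete, here is a rough sketch in plain C# with no Lucene calls at all: documents are assigned to one of N splits by a stable hash of their id, and per-split hit lists are woven into one global top-N by score. ScoredHit, ShardFor, Split and Merge are hypothetical names invented for this illustration. One caveat to keep in mind: scores computed against separate indexes are not guaranteed to be directly comparable, so a score-based merge like this is an approximation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical result item returned by each split's search.
public class ScoredHit
{
    public string DocId { get; set; }
    public double Score { get; set; }
}

public static class ShardingSketch
{
    // FNV-1a style hash over the id's UTF-16 chars. A hand-rolled hash is
    // used because string.GetHashCode is not guaranteed to be stable across
    // runtimes, and every machine must map an id to the same split.
    public static int ShardFor(string docId, int shardCount)
    {
        unchecked
        {
            uint hash = 2166136261u;
            foreach (char c in docId)
            {
                hash ^= c;
                hash *= 16777619u;
            }
            return (int)(hash % (uint)shardCount);
        }
    }

    // Partitions document ids into shardCount lists; each list would then be
    // indexed separately and shipped to its own machine.
    public static List<List<string>> Split(IEnumerable<string> docIds, int shardCount)
    {
        var shards = Enumerable.Range(0, shardCount)
                               .Select(_ => new List<string>())
                               .ToList();
        foreach (var id in docIds)
            shards[ShardFor(id, shardCount)].Add(id);
        return shards;
    }

    // Weaves the per-split hit lists into a single top-N list by score.
    public static List<ScoredHit> Merge(IEnumerable<IEnumerable<ScoredHit>> perShard, int topN)
    {
        return perShard.SelectMany(hits => hits)
                       .OrderByDescending(h => h.Score)
                       .Take(topN)
                       .ToList();
    }
}
```

With a layout like this, a search front end would send the same query to every split, collect each split's hits, and call Merge before paging.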

Distribution of search is a tough problem, and the right approach depends on your index size, query performance, etc. To find a good approach with Lucene, I suggest you ask on the Lucene mailing list about how to do this.

LINQ to Lucene may not be able to help you with this problem, but of course that depends on what your indexing/splitting/searching/combination strategy is.

Ask on the mailing list, and please keep us informed about what solution works for you. Maybe we can add a way to accomplish this to LINQ to Lucene.





Coordinator
Apr 13, 2009 at 7:14 AM
Look at the Distributed Search project in Lucene.Net Contrib. You can get the source from the Lucene.Net trunk: http://incubator.apache.org/lucene.net/