Supporting QueryAnalyzers

Coordinator
Oct 26, 2007 at 10:06 PM
The DocumentAttribute class provides information specific to creating an index from the class. The biggest current shortcoming of this attribute is that it only supports the StandardAnalyzer class. Since we cannot pass anything but constants, enums, or types to an attribute, I am not sure of the best way for a class to specify the appropriate analyzer. I'm also not sure that the analyzer can even be inferred as globally as at the class level, so some thoughts here are welcome.
Coordinator
Nov 7, 2007 at 5:55 PM
What's wrong with specifying an analyzer Type in the DocumentAttribute? For example: [Document(typeof(MyAnalyzer))]

If the chosen analyzer (like StandardAnalyzer) requires constructor arguments (like stop words for StandardAnalyzer), you could encourage deriving new concrete analyzer types that have parameterless constructors.
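
To make the idea concrete, here is a rough sketch of how a Type passed to the attribute could be turned into an analyzer instance. Everything in it is an assumption for illustration: the Analyzer and StopWordAnalyzer classes are stand-ins for the Lucene types so the snippet compiles on its own, this DocumentAttribute is not the project's actual attribute, and CityAnalyzer and AnalyzerResolver are hypothetical names.

```csharp
using System;

// Stand-in for Lucene's Analyzer base class so the sketch compiles on its own.
public abstract class Analyzer { }

// Stand-in for an analyzer that needs constructor arguments (e.g. stop words).
public class StopWordAnalyzer : Analyzer
{
    public string[] StopWords { get; }
    public StopWordAnalyzer(string[] stopWords) { StopWords = stopWords; }
}

// Hypothetical derived type: the arguments are baked into a parameterless constructor.
public sealed class CityAnalyzer : StopWordAnalyzer
{
    public CityAnalyzer() : base(new[] { "the", "of" }) { }
}

// Hypothetical attribute shape: a Type is one of the few things an attribute can accept.
[AttributeUsage(AttributeTargets.Class)]
public sealed class DocumentAttribute : Attribute
{
    public Type AnalyzerType { get; }

    public DocumentAttribute(Type analyzerType)
    {
        if (analyzerType == null)
            throw new ArgumentNullException(nameof(analyzerType));
        if (!typeof(Analyzer).IsAssignableFrom(analyzerType))
            throw new ArgumentException("Type must derive from Analyzer.", nameof(analyzerType));
        AnalyzerType = analyzerType;
    }
}

[Document(typeof(CityAnalyzer))]
public class City { }

public static class AnalyzerResolver
{
    // Reads the attribute from the document class and instantiates the analyzer.
    // Activator.CreateInstance throws if the type has no public parameterless constructor.
    public static Analyzer Resolve(Type documentType)
    {
        var attr = (DocumentAttribute)Attribute.GetCustomAttribute(
            documentType, typeof(DocumentAttribute));
        if (attr == null)
            throw new InvalidOperationException(documentType.Name + " has no [Document] attribute.");
        return (Analyzer)Activator.CreateInstance(attr.AnalyzerType);
    }
}
```

With this shape, AnalyzerResolver.Resolve(typeof(City)) would hand back a CityAnalyzer, and the stop words never have to pass through the attribute itself.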

This is an acceptable solution in my opinion, because one rarely wants to rely on the default analyzer to tokenize strings for ALL fields. Lucene users typically use the PerFieldAnalyzerWrapper. You could provide a BasePerFieldAnalyzer that extends PerFieldAnalyzerWrapper with a protected dictionary for specifying fields and their specific analyzers.
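
A minimal sketch of what such a BasePerFieldAnalyzer might look like follows. It is only an assumption about the shape: the Analyzer, StandardAnalyzer, KeywordAnalyzer and PerFieldAnalyzerWrapper classes below are simplified stand-ins for the Lucene types so the snippet compiles without the library, GetAnalyzer is a helper added to the stand-in for demonstration only, and CityAnalyzer is a hypothetical example.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-ins for the Lucene types, so the sketch compiles on its own.
public abstract class Analyzer { }
public sealed class StandardAnalyzer : Analyzer { }
public sealed class KeywordAnalyzer : Analyzer { }

// Stand-in that mirrors the general shape of PerFieldAnalyzerWrapper:
// a default analyzer plus per-field overrides registered via AddAnalyzer.
public class PerFieldAnalyzerWrapper : Analyzer
{
    private readonly Analyzer defaultAnalyzer;
    private readonly Dictionary<string, Analyzer> analyzers = new Dictionary<string, Analyzer>();

    public PerFieldAnalyzerWrapper(Analyzer defaultAnalyzer)
    {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    public void AddAnalyzer(string fieldName, Analyzer analyzer)
    {
        analyzers[fieldName] = analyzer;
    }

    // Demonstration-only helper; not claimed to be part of the real class.
    public Analyzer GetAnalyzer(string fieldName)
    {
        Analyzer found;
        return analyzers.TryGetValue(fieldName, out found) ? found : defaultAnalyzer;
    }
}

// Hypothetical base class: subclasses pass a field-to-analyzer dictionary,
// which stays visible to them through the protected property.
public abstract class BasePerFieldAnalyzer : PerFieldAnalyzerWrapper
{
    protected IDictionary<string, Analyzer> FieldAnalyzers { get; }

    protected BasePerFieldAnalyzer(Analyzer defaultAnalyzer,
                                   IDictionary<string, Analyzer> fieldAnalyzers)
        : base(defaultAnalyzer)
    {
        FieldAnalyzers = fieldAnalyzers;
        foreach (var pair in fieldAnalyzers)
            AddAnalyzer(pair.Key, pair.Value);
    }
}

// Parameterless, so it could be named from an attribute via typeof(CityAnalyzer).
public sealed class CityAnalyzer : BasePerFieldAnalyzer
{
    public CityAnalyzer()
        : base(new StandardAnalyzer(),
               new Dictionary<string, Analyzer> { { "Name", new KeywordAnalyzer() } })
    { }
}
```

The point of the design is that the per-field mapping lives in code, inside a type with a parameterless constructor, so the attribute only ever needs to carry a Type.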


Locksley wrote:
The DocumentAttribute class provides information specific to creating an index from the class. The biggest current shortcoming of this attribute is that it only supports the StandardAnalyzer class. Since we cannot pass anything but constants, enums, or types to an attribute, I am not sure of the best way for a class to specify the appropriate analyzer. I'm also not sure that the analyzer can even be inferred as globally as at the class level, so some thoughts here are welcome.


Coordinator
Nov 7, 2007 at 8:26 PM
I like that suggestion, because it lets each field determine its own analyzer while the default analyzer is set at a more global level. This implies that the type of analyzer used per field would be specified as part of the FieldAttribute. Perhaps the best way to do all of this would be to create analyzer attributes that could be applied to the class or the property, where each analyzer's properties could be set:
[StandardAnalyzer(...)]
[Document]
[Table]
public class City
{
  [Column(...)]
  [StopAnalyzer(...)]
  [Field(...)]
  [TypeConverter(...)]
  public int Longitude { get; set; }
}

Just a thought. (Is that too much decorating?)
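
For what it's worth, here is one way the analyzer-attribute idea could be sketched. This is not an existing API: AnalyzerAttribute, StandardAnalyzerAttribute, StopAnalyzerAttribute and AnalyzerLookup are hypothetical names, the Analyzer classes are stand-ins for the Lucene types so the snippet compiles by itself, and the stop words shown are made-up sample values. The lookup assumes the property-level attribute wins and the class-level one acts as the default.

```csharp
using System;
using System.Reflection;

// Stand-ins for the Lucene analyzers, so the sketch compiles on its own.
public abstract class Analyzer { }
public sealed class StandardAnalyzer : Analyzer { }
public sealed class StopAnalyzer : Analyzer
{
    public string[] StopWords { get; }
    public StopAnalyzer(string[] stopWords) { StopWords = stopWords; }
}

// Hypothetical base attribute: each analyzer attribute knows how to build its analyzer.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Property)]
public abstract class AnalyzerAttribute : Attribute
{
    public abstract Analyzer CreateAnalyzer();
}

public sealed class StandardAnalyzerAttribute : AnalyzerAttribute
{
    public override Analyzer CreateAnalyzer() { return new StandardAnalyzer(); }
}

public sealed class StopAnalyzerAttribute : AnalyzerAttribute
{
    private readonly string[] stopWords;

    // String arrays are legal attribute arguments, so stop words can be passed inline.
    public StopAnalyzerAttribute(params string[] stopWords) { this.stopWords = stopWords; }

    public override Analyzer CreateAnalyzer() { return new StopAnalyzer(stopWords); }
}

[StandardAnalyzer]                // class-level default
public class City
{
    [StopAnalyzer("the", "of")]   // per-property override (sample values)
    public string Name { get; set; }

    public int Longitude { get; set; }
}

public static class AnalyzerLookup
{
    // Property-level attribute first, then the declaring class; null if neither is present.
    public static Analyzer ForProperty(PropertyInfo property)
    {
        AnalyzerAttribute attr = property.GetCustomAttribute<AnalyzerAttribute>()
            ?? property.DeclaringType.GetCustomAttribute<AnalyzerAttribute>();
        return attr == null ? null : attr.CreateAnalyzer();
    }
}
```

Under these assumptions, looking up Name would yield a StopAnalyzer and Longitude would fall back to the class-level StandardAnalyzer, which is essentially the per-field wrapper behavior expressed through decoration.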