Apache Solr 3 Enterprise Search Server

Enhance Your Search With Faceted Navigation, Result Highlighting, Relevancy Ranked Sorting, And More

Last edited by Drini
October 25, 2025

Publish Date
Publisher
Packt Publishing
Language
English
Pages
393

Book Details


Table of Contents

Preface
Page ix
Chapter 1. Quick Starting Solr
Page 7
An introduction to Solr
Page 7
Lucene, the underlying engine
Page 8
Solr, a Lucene-based search server
Page 9
Comparison to database technology
Page 10
Getting started
Page 11
Solr's installation directory structure
Page 12
Solr's home directory and Solr cores
Page 14
Running Solr
Page 15
A quick tour of Solr
Page 16
Loading sample data
Page 18
A simple query
Page 20
Some statistics
Page 23
The sample browse interface
Page 24
Configuration files
Page 25
Resources outside this book
Page 27
Summary
Page 28
Chapter 2. Schema and Text Analysis
Page 29
MusicBrainz.org
Page 30
One combined index or separate indices
Page 31
One combined index
Page 32
Problems with using a single combined index
Page 33
Separate indices
Page 34
Schema design
Page 35
Step 1: Determine which searches are going to be powered by Solr
Page 36
Step 2: Determine the entities returned from each search
Page 36
Step 3: Denormalize related data
Page 37
Denormalizing — 'one-to-one' associated data
Page 37
Denormalizing — 'one-to-many' associated data
Page 38
Step 4: (Optional) Omit the inclusion of fields only used in search results
Page 39
The schema.xml file
Page 40
Defining field types
Page 41
Built-in field type classes
Page 42
Numbers and dates
Page 42
Geospatial
Page 43
Field options
Page 43
Field definitions
Page 44
Dynamic field definitions
Page 45
Our MusicBrainz field definitions
Page 46
Copying fields
Page 48
The unique key
Page 49
The default search field and query operator
Page 49
Text analysis
Page 50
Configuration
Page 51
Experimenting with text analysis
Page 54
Character filters
Page 55
Tokenization
Page 57
WordDelimiterFilter
Page 59
Stemming
Page 61
Correcting and augmenting stemming
Page 62
Synonyms
Page 63
Index-time versus query-time, and to expand or not
Page 64
Stop words
Page 65
Phonetic sounds-like analysis
Page 66
Substring indexing and wildcards
Page 67
ReversedWildcardFilter
Page 68
N-grams
Page 69
N-gram costs
Page 70
Sorting Text
Page 71
Miscellaneous token filters
Page 72
Summary
Page 73
Chapter 3. Indexing Data
Page 75
Communicating with Solr
Page 76
Direct HTTP or a convenient client API
Page 76
Push data to Solr or have Solr pull it
Page 76
Data formats
Page 76
HTTP POSTing options to Solr
Page 77
Remote streaming
Page 79
Solr's Update-XML format
Page 80
Deleting documents
Page 81
Commit, optimize, and rollback
Page 82
Sending CSV formatted data to Solr
Page 84
Configuration options
Page 86
The Data Import Handler Framework
Page 87
Setup
Page 88
The development console
Page 89
Writing a DIH configuration file
Page 90
Data Sources
Page 90
Entity processors
Page 91
Fields and transformers
Page 92
Example DIH configurations
Page 94
Importing from databases
Page 94
Importing XML from a file with XSLT
Page 96
Importing multiple rich document files (crawling)
Page 97
Importing commands
Page 98
Delta imports
Page 99
Indexing documents with Solr Cell
Page 100
Extracting text and metadata from files
Page 100
Configuring Solr
Page 101
Solr Cell parameters
Page 102
Extracting karaoke lyrics
Page 104
Indexing richer documents
Page 106
Update request processors
Page 109
Summary
Page 110
Chapter 4. Searching
Page 111
Your first search, a walk-through
Page 112
Solr's generic XML structured data representation
Page 114
Solr's XML response format
Page 115
Parsing the URL
Page 116
Request handlers
Page 117
Query parameters
Page 119
Search criteria related parameters
Page 119
Result pagination related parameters
Page 120
Output related parameters
Page 121
Diagnostic related parameters
Page 121
Query parsers and local-params
Page 122
Query syntax (the lucene query parser)
Page 123
Matching all the documents
Page 125
Mandatory, prohibited, and optional clauses
Page 125
Boolean operators
Page 126
Sub-queries
Page 127
Limitations of prohibited clauses in sub-queries
Page 128
Field qualifier
Page 128
Phrase queries and term proximity
Page 129
Wildcard queries
Page 129
Fuzzy queries
Page 131
Range queries
Page 131
Date math
Page 132
Score boosting
Page 133
Existence (and non-existence) queries
Page 134
Escaping special characters
Page 134
The Dismax query parser (part 1)
Page 135
Searching multiple fields
Page 137
Limited query syntax
Page 137
Min-should-match
Page 138
Basic rules
Page 138
Multiple rules
Page 139
What to choose
Page 140
A default search
Page 140
Filtering
Page 141
Sorting
Page 142
Geospatial search
Page 143
Indexing locations
Page 143
Filtering by distance
Page 144
Sorting by distance
Page 145
Summary
Page 146
Chapter 5. Search Relevancy
Page 147
Scoring
Page 148
Query-time and index-time boosting
Page 149
Troubleshooting queries and scoring
Page 149
Dismax query parser (part 2)
Page 151
Lucene's DisjunctionMaxQuery
Page 152
Boosting: Automatic phrase boosting
Page 153
Configuring automatic phrase boosting
Page 153
Phrase slop configuration
Page 154
Partial phrase boosting
Page 154
Boosting: Boost queries
Page 155
Boosting: Boost functions
Page 156
Add or multiply boosts?
Page 157
Function queries
Page 158
Field references
Page 159
Function reference
Page 160
Mathematical primitives
Page 161
Other math
Page 161
ord and rord
Page 162
Miscellaneous functions
Page 162
Function query boosting
Page 164
Formula: Logarithm
Page 164
Formula: Inverse reciprocal
Page 165
Formula: Reciprocal
Page 167
Formula: Linear
Page 168
How to boost based on an increasing numeric field
Page 168
Step by step...
Page 169
External field values
Page 170
How to boost based on recent dates
Page 170
Step by step...
Page 170
Summary
Page 171
Chapter 6. Faceting
Page 173
A quick example: Faceting release types
Page 174
MusicBrainz schema changes
Page 176
Field requirements
Page 178
Types of faceting
Page 178
Faceting field values
Page 179
Alphabetic range bucketing
Page 181
Faceting numeric and date ranges
Page 182
Range facet parameters
Page 185
Facet queries
Page 187
Building a filter query from a facet
Page 188
Field value filter queries
Page 189
Facet range filter queries
Page 189
Excluding filters (multi-select faceting)
Page 190
Hierarchical faceting
Page 194
Summary
Page 196
Chapter 7. Search Components
Page 197
About components
Page 198
The Highlight component
Page 200
A highlighting example
Page 200
Highlighting configuration
Page 202
The regex fragmenter
Page 205
The fast vector highlighter with multi-colored highlighting
Page 205
The SpellCheck component
Page 207
Schema configuration
Page 208
Configuration in solrconfig.xml
Page 209
Configuring spellcheckers (dictionaries)
Page 211
Processing of the q parameter
Page 213
Processing of the spellcheck.q parameter
Page 213
Building the dictionary from its source
Page 214
Issuing spellcheck requests
Page 215
Example usage for a misspelled query
Page 217
Query complete / suggest
Page 219
Query term completion via facet.prefix
Page 221
Query term completion via the Suggester
Page 223
Query term completion via the Terms component
Page 226
The QueryElevation component
Page 227
Configuration
Page 228
The MoreLikeThis component
Page 230
Configuration parameters
Page 231
Parameters specific to the MLT search component
Page 231
Parameters specific to the MLT request handler
Page 231
Common MLT parameters
Page 232
MLT results example
Page 234
The Stats component
Page 236
Configuring the stats component
Page 237
Statistics on track durations
Page 237
The Clustering component
Page 238
Result grouping / Field collapsing
Page 239
Configuring result grouping
Page 241
The TermVector component
Page 243
Summary
Page 243
Chapter 8. Deployment
Page 245
Deployment methodology for Solr
Page 245
Questions to ask
Page 246
Installing Solr into a Servlet container
Page 247
Differences between Servlet containers
Page 248
Defining solr.home property
Page 248
Logging
Page 249
HTTP server request access logs
Page 250
Solr application logging
Page 251
Configuring logging output
Page 252
Logging using Log4j
Page 253
Jetty startup integration
Page 253
Managing log levels at runtime
Page 254
A SearchHandler per search interface?
Page 254
Leveraging Solr cores
Page 256
Configuring solr.xml
Page 256
Property substitution
Page 258
Include fragments of XML with XInclude
Page 259
Managing cores
Page 259
Why use multicore?
Page 261
Monitoring Solr performance
Page 262
Stats.jsp
Page 263
JMX
Page 264
Starting Solr with JMX
Page 265
Securing Solr from prying eyes
Page 270
Limiting server access
Page 270
Securing public searches
Page 272
Controlling JMX access
Page 273
Securing index data
Page 273
Controlling document access
Page 273
Other things to look at
Page 274
Summary
Page 275
Chapter 9. Integrating Solr
Page 277
Working with included examples
Page 278
Inventory of examples
Page 278
Solritas, the integrated search UI
Page 279
Pros and Cons of Solritas
Page 281
SolrJ: Simple Java interface
Page 283
Using Heritrix to download artist pages
Page 283
SolrJ-based client for Indexing HTML
Page 285
SolrJ client API
Page 287
Embedding Solr
Page 288
Searching with SolrJ
Page 289
Indexing
Page 290
When should I use embedded Solr?
Page 294
In-process indexing
Page 294
Standalone desktop applications
Page 295
Upgrading from legacy Lucene
Page 295
Using JavaScript with Solr
Page 296
Wait, what about security?
Page 297
Building a Solr powered artists autocomplete widget with jQuery and JSONP
Page 298
AJAX Solr
Page 303
Using XSLT to expose Solr via OpenSearch
Page 305
OpenSearch based Browse plugin
Page 306
Installing the Search MBArtists plugin
Page 306
Accessing Solr from PHP applications
Page 309
solr-php-client
Page 310
Drupal options
Page 311
Apache Solr Search integration module
Page 312
Hosted Solr by Acquia
Page 312
Ruby on Rails integrations
Page 313
The Ruby query response writer
Page 313
sunspot_rails gem
Page 314
Setting up MyFaves project
Page 315
Populating MyFaves relational database from Solr
Page 316
Build Solr indexes from a relational database
Page 318
Complete MyFaves website
Page 320
Which Rails/Ruby library should I use?
Page 322
Nutch for crawling web pages
Page 323
Maintaining document security with ManifoldCF
Page 324
Connectors
Page 325
Putting ManifoldCF to use
Page 325
Summary
Page 328
Chapter 10. Scaling Solr
Page 329
Tuning complex systems
Page 330
Testing Solr performance with SolrMeter
Page 332
Optimizing a single Solr server (Scale up)
Page 334
Configuring JVM settings to improve memory usage
Page 334
MMapDirectoryFactory to leverage additional virtual memory
Page 335
Enabling downstream HTTP caching
Page 335
Solr caching
Page 338
Tuning caches
Page 339
Indexing performance
Page 340
Designing the schema
Page 340
Sending data to Solr in bulk
Page 341
Don't overlap commits
Page 342
Disabling unique key checking
Page 343
Index optimization factors
Page 343
Enhancing faceting performance
Page 345
Using term vectors
Page 345
Improving phrase search performance
Page 346
Moving to multiple Solr servers (Scale horizontally)
Page 348
Replication
Page 349
Starting multiple Solr servers
Page 349
Configuring replication
Page 351
Load balancing searches across slaves
Page 352
Indexing into the master server
Page 352
Configuring slaves
Page 353
Configuring load balancing
Page 354
Sharding indexes
Page 356
Assigning documents to shards
Page 357
Searching across shards (distributed search)
Page 358
Combining replication and sharding (Scale deep)
Page 360
Near real time search
Page 362
Where next for scaling Solr?
Page 363
Summary
Page 364
Appendix: Search Quick Reference
Page 365
Quick reference
Page 366
Index
Page 369

Classifications

Library of Congress
QA76.76.A65 S65 2011eb

Edition Identifiers

Open Library
OL26053500M
Internet Archive
apachesolr3enter0000smil
ISBN 13
9781849516068
OCLC/WorldCat
785645798, 774288834

Work Identifiers

Work ID
OL17468014W

History

October 25, 2025 Edited by Drini Edited without comment.
October 8, 2023 Edited by raybb Bulk tagging works
October 8, 2023 Edited by raybb Edited without comment.
December 13, 2022 Edited by MARC Bot import existing book
October 14, 2016 Created by Mek Added new book.