Scaling Apache Solr

Optimize your searches using high-performance enterprise search repositories with Apache Solr

Overview

  • Get an introduction to the basics of Apache Solr in a step-by-step manner with lots of examples
  • Develop and understand the workings of enterprise search solutions using various techniques and real-life use cases
  • Gain a practical insight into the advanced ways of optimizing and making an enterprise search solution cloud ready

In Detail

This book is for individuals who want to build high-performance, scalable, enterprise-ready search engines for their customers or organizations. The book starts with the basics of Apache Solr, covering different ways to analyze enterprise information and design enterprise-ready search engines using Solr. It also discusses how to scale Solr-based enterprise search to the next level.

Each chapter takes you through more advanced levels of Apache Solr with real-world practical details such as installing, setting up, and configuring instances. This book contains detailed explanations of the basic and advanced features of Apache Solr.

By sequentially working through the steps in each chapter and with the help of real-life industry examples, you will quickly master the features of Apache Solr to build search solutions for enterprises.

What you will learn from this book

  • Gain a complete understanding of Apache Solr and its ecosystem
  • Develop scalable, high-performance search applications using Apache Solr
  • Customize Apache-Solr-based search for different requirements
  • Discover different techniques to build high-speed enterprise searches
  • Design enterprise-ready search engines and implement a scalable enterprise search functionality
  • Integrate an Apache-Solr-based search with different subsystems and legacy systems
  • Scale Apache Solr through sharding, replication, and fault tolerance
  • Learn about performance tuning for your Solr-based application while scaling your data
  • Make your enterprise search cloud-ready to be able to work with multiple clients

Approach

This book is a step-by-step guide for readers who would like to learn how to build complete enterprise search solutions, with ample real-world examples and case studies.

Who this book is written for

If you are a developer, designer, or architect who would like to build enterprise search solutions for your customers or organization, but have no prior knowledge of Apache Solr/Lucene technologies, this is the book for you.

List Price: $43.99

2 comments

  1. 4.0 out of 5 stars
    has written an extremely useful guide to one of the most popular open-source search …, October 13, 2014
    This review is from: Scaling Apache Solr (Paperback)
    We live in a world flooded by data and information, and we all realize that if we can’t find what we’re looking for (e.g. a specific document), there’s no benefit from all these data stores. When your data sets become enormous or your systems need to process thousands of messages a second, you need an environment that is efficient, tunable and ready for scaling. We all need well-designed search technology.

    A few days ago, a book called “Scaling Apache Solr” landed on my desk. The author, Hrishikesh Vijay Karambelkar, has written an extremely useful guide to one of the most popular open-source search platforms, Apache Solr. Solr is a full-text, standalone, Java search engine based on Lucene, another successful Apache project. For people working with Solr, like myself, this book should be on their Christmas shopping list! It’s one of the best on this subject.

    Karambelkar is an enterprise architect with a long history in both commercial products and open source technology. As he says, he currently spends most of his time solving problems for the software industry and developing the next generation of products.

    The book is divided into 10 chapters. Basically, the first three are an introduction to Apache Solr and cover its architecture, features, configuration and setting up. Chapter One contains many practical cases of Apache Solr, to help beginners understand the topic.
    Chapter Four is very interesting and describes common patterns for enterprise search solutions. These patterns focus on data processing/integration and how to meet the requirements of users (interface, relevancy, general experience).
    The rest of the book mainly addresses the central topic, that is, distributing search queries and scaling/optimizing a system. The book discusses Apache Solr concepts such as replication, fault tolerance and sharding, and illustrates them with helpful examples. The book precisely explains SolrCloud, a bundle of built-in distributed capabilities available from version 4.0.
    Chapter 8, dedicated to optimization, drew my attention. It is full of useful tips concerning JVM parameters and manipulating data structures or caching layers as well.

    “Scaling Apache Solr” covers both basic and advanced subjects. The information is well organised, clear and concise. Lots of examples and cases in this book can be absorbed by beginners. I was nicely surprised by the chapter describing integration possibilities. There’s some great information about using Solr with Cassandra, the MapReduce paradigm or R (a programming language for computational statistics), although I would have preferred this subject to be covered in more detail. The book has two more advantages: first, it discusses designing an enterprise search system in general terms, and second, it can be treated as an introduction to large-volume data processing.
    I believe I need to emphasize that many sections related to defining a schema, importing data, running SolrCloud or searching in near real time (NRT) are not just raw documentation; they also include the author’s well-judged advice and comments.
    Unfortunately, I felt some of the more advanced topics were not described in enough detail, for example index merging, document relevance or using dynamic fields in a data structure. Moreover, reading the book, I had a feeling that some parts do not fit the title, such as the section about clustering with Carrot2 or integration with a PHP web portal.

    In summary, I can say that I have read this book with pleasure and satisfaction, which in fact is rare for technology publications! For me, as a person who has been working with Solr since version 1.3, it was a great way to review and sort out some of its aspects. On the other hand, I’m pretty sure that people starting their experience with Apache Solr will take a lot from this book. Although it is mainly focused on advanced problems, it starts with the basics.
    Despite some minor imperfections, I can truly recommend this book, especially because it describes a concrete technology in an easy-to-read way and also refers to some general architectural patterns.

  2. 5.0 out of 5 stars
    … Solr by Hrishikesh Karambelkar turned out to be a great surprise. Having my expectation set initially low (without …, October 3, 2014
    By A. Zubarev (Toronto, ON Canada)


    Reading Scaling Apache Solr by Hrishikesh Karambelkar turned out to be a great surprise. Although my expectations were initially low (for no apparent reason), it unfolded into something huge that I would have regretted passing by. It is an extremely thoughtful work, full of practical examples … I do not know what to call it, whether a cookbook or a handbook, but it is definitely a wonderful masterpiece!

    The book may be a good source of wisdom or advice for new projects and even serve as a guide to resolving issues or improving poorly performing search applications.

    Hrishikesh covers a wide (be warned, it is wide, and in a good sense of the word) variety of topics in his work:
      • Data processing with Apache Solr (techniques)
      • Enterprise search design principles
      • Integration examples (Java, Drupal and more)
      • Distributed search, leveraging the Cloud
      • Making Solr scalable
      • Monitoring of Solr, including optimization
      • Integration with Big Data pillars such as Hadoop, Zookeeper, Katta
      • NoSQL: MongoDB and Cassandra
      • Data analysis with R
    As a bonus, the author covers fixes to the most common pitfalls or errors.
    It blew away my expectations! THIS IS the book you want to keep on your bookshelf and electronic media.

    After reading this book you can be assured of sailing without fear on the high waters of the Big Data ocean!

    5 stars + out of five!
