Create a user named solr.
sudo useradd -r -s /bin/bash solr
Raise the open-file and process limits for the solr user in /etc/security/limits.conf.
solr hard nofile 65535
solr soft nofile 65535
solr hard nproc 65535
solr soft nproc 65535
Turn off swap.
sudo swapoff -a
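To confirm that swap is really off, the kernel's list of active swap areas can be counted. This is a minimal sketch: active_swap_count is a hypothetical helper name, and it assumes the usual /proc/swaps layout with one header line followed by one line per swap area.

```shell
# Count active swap areas in a /proc/swaps-style file.
# Assumption: the first line is a header, every later non-empty line is one swap area.
active_swap_count() {
  awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "$1"
}

if [ -r /proc/swaps ]; then
  echo "active swap areas: $(active_swap_count /proc/swaps)"
else
  echo "/proc/swaps is not readable on this system"
fi
```

Note that swapoff -a only lasts until the next reboot; swap entries in /etc/fstab will enable swap again at boot unless they are removed or commented out.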
Log out and log back in so that the new limits take effect.
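After logging back in as the solr user, the limits can be checked from the shell. The sketch below is only an illustration: limit_ok and report are hypothetical helper names, and 65000 is the threshold that Solr's own startup warning mentions.

```shell
# Succeed when a ulimit value meets the wanted minimum; "unlimited" always passes.
limit_ok() {
  case "$1" in
    unlimited) return 0 ;;
    ''|*[!0-9]*) return 1 ;;   # empty or not a number: treat as not ok
  esac
  [ "$1" -ge "$2" ]
}

# Threshold taken from the Solr startup warning.
WANTED=65000

report() {
  if limit_ok "$2" "$WANTED"; then
    echo "$1: $2 (ok)"
  else
    echo "$1: $2 (below $WANTED or unknown)"
  fi
}

report "open files" "$(ulimit -n)"
# ulimit -u is not available in every shell, so fall back to "unknown".
report "max processes" "$(ulimit -u 2>/dev/null || echo unknown)"
```

Run it after sudo su - solr so that the values shown are the ones the solr user actually gets.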
Install Java.
sudo apt install -y openjdk-11-jdk
sudo su
apt update
apt upgrade -y
apt install lsof
cd /opt
curl https://apache.mirror.digionline.de/lucene/solr/8.8.0/solr-8.8.0.tgz -O
tar xzf solr-8.8.0.tgz solr-8.8.0/bin/install_solr_service.sh --strip-components=2
bash ./install_solr_service.sh solr-8.8.0.tgz
systemctl stop solr
chown -R solr /opt/solr-8.8.0
chgrp -R solr /opt/solr-8.8.0
Open port 8983 to the other nodes of the SolrCloud cluster and to your local client.
You can start and stop Solr with systemctl (start|stop) solr, but it is worth checking that the bin/solr commands work as well.
As the solr user:
# Start
/opt/solr$ ./bin/solr start
# Stop
/opt/solr$ ./bin/solr stop
Now you can check Solr from a browser at http://{{ your_solr_server_domain }}:8983/solr/
So far, we have been running Solr as a single instance.
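The same check can be done from a terminal with curl. This is a sketch under assumptions: solr_url is a hypothetical helper, SOLR_HOST is an environment variable introduced here for illustration, and the request goes to Solr's admin/info/system endpoint.

```shell
# Hypothetical helper: build a URL under the Solr base path on port 8983.
solr_url() {
  printf 'http://%s:8983/solr/%s\n' "$1" "$2"
}

# SOLR_HOST is an assumed variable; the request is skipped when it is unset.
if [ -n "${SOLR_HOST:-}" ]; then
  # -f makes curl fail on HTTP errors; -sS hides the progress bar but keeps error messages.
  if curl -fsS "$(solr_url "$SOLR_HOST" 'admin/info/system?wt=json')" >/dev/null; then
    echo "Solr is answering on $SOLR_HOST"
  else
    echo "Solr did not answer on $SOLR_HOST" >&2
  fi
else
  echo "Set SOLR_HOST to run the check, for example SOLR_HOST=localhost"
fi
```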
cd /opt/solr-8.8.0
bin/solr -e cloud
The cloud example runs two Java processes on different ports on a single server.
On node1:
sudo su - solr
cd /opt/solr-8.8.0
mkdir -p example/cloud/node1/solr
cp server/solr/solr.xml example/cloud/node1/solr
./bin/solr start -cloud -s example/cloud/node1/solr -p 8983 -z localhost:2181
example/cloud/node1/solr is an arbitrarily chosen directory that serves as the Solr home directory for this node.
On node2:
sudo su - solr
cd /opt/solr-8.8.0
mkdir -p example/cloud/node2/solr
cp server/solr/solr.xml example/cloud/node2/solr
./bin/solr start -cloud -s example/cloud/node2/solr -p 8983 -z localhost:2181
On node3:
sudo su - solr
cd /opt/solr-8.8.0
mkdir -p example/cloud/node3/solr
cp server/solr/solr.xml example/cloud/node3/solr
./bin/solr start -cloud -s example/cloud/node3/solr -p 8983 -z localhost:2181
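The three per-node blocks differ only in the node name, so they can be generated from one template. The sketch below only prints the commands and does not run them; print_node_steps is a hypothetical helper, and ZK_HOST is an assumed variable that defaults to the localhost:2181 address used in these notes.

```shell
# ZooKeeper address passed to -z; assumption: override ZK_HOST for a real cluster.
ZK_HOST="${ZK_HOST:-localhost:2181}"

# Print (do not run) the steps from the notes for one node name.
print_node_steps() {
  home="example/cloud/$1/solr"
  printf 'mkdir -p %s\n' "$home"
  printf 'cp server/solr/solr.xml %s\n' "$home"
  printf './bin/solr start -cloud -s %s -p 8983 -z %s\n' "$home" "$ZK_HOST"
}

for node in node1 node2 node3; do
  echo "# $node"
  print_node_steps "$node"
done
```

Once a node is started, ./bin/solr status on that machine shows whether it is running in cloud mode.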
Uploading a config set (this step is unverified):
chmod u+x solr-8.8.0/server/scripts/cloud-scripts/zkcli.sh
./server/scripts/cloud-scripts/zkcli.sh -zkhost solrnode1.com -cmd upconfig -confname _default -confdir server/solr/configsets/_default/conf
doc.lucidworks.com/lucidworks-hdpsearch/2.5/Guide-Solr.html
Terminology: Cores, Collections & Nodes
There are several terms that are used to describe parts of a SolrCloud implementation, and it’s helpful to try to understand them early:
Core (a kind of index plus its schema): A single Solr instance, which represents a single Solr index. A core has a different set of configuration files and schema definitions than other cores.
Document: A kind of record. A document contains fields.
Collection (the logical index of a SolrCloud cluster): A group of cores that together form a single logical index. A collection has a different set of configuration files and schema definitions than other collections. (My note: a single collection can be distributed.)
Shard: A logical section of a single collection.
Node: A Java Virtual Machine instance running Solr, commonly known as a server. Multiple cores can run on a node if you wish.
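In the browser: open Collections and add a collection named sample_collection with the _default config set, entering 2 and 2 (presumably numShards and replicationFactor), then open "show advanced" and enter 2 (presumably maxShardsPerNode).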
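The same collection could probably be created without the browser through the Collections API. This is a sketch under loud assumptions: the three values of 2 in the note are read here as numShards, replicationFactor and maxShardsPerNode, create_collection_url is a hypothetical helper, and SOLR_HOST is an assumed environment variable.

```shell
# Build a Collections API CREATE URL for Solr 8.x.
# Assumption: arguments are host, collection name, numShards, replicationFactor, maxShardsPerNode.
create_collection_url() {
  printf 'http://%s:8983/solr/admin/collections?action=CREATE&name=%s&numShards=%s&replicationFactor=%s&maxShardsPerNode=%s&collection.configName=_default\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# SOLR_HOST is an assumed variable; without it the URL is only printed for localhost.
if [ -n "${SOLR_HOST:-}" ]; then
  curl -fsS "$(create_collection_url "$SOLR_HOST" sample_collection 2 2 2)"
else
  create_collection_url localhost sample_collection 2 2 2
fi
```

With 2 shards and 2 replicas each there are 4 cores to place, so on a three-node cluster at least one node has to hold 2 of them, which would explain the last value.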
https://mkyong.com/solr/apache-solr-hello-world-example/
3.2 What is a Solr Core?
In Apache Solr, a Solr Core is also known as simply “Core”. A Core is an index of texts and fields available in all documents. One Solr instance can contain one or more Solr Cores. In other words, a Solr Core = an instance of an Apache Lucene index + Solr configuration (solr.xml, solrconfig.xml etc.)
3.3 What is Indexing?
In Apache Lucene or Solr, indexing is a technique of adding a Document’s content to the Solr index so that we can search it easily. Apache Solr uses Apache Lucene’s inverted index technique to index its documents, which is why Solr provides very fast search.
3.4 What is a Document?
In Apache Solr, a Document is a group of fields and their values. Documents are the basic unit of data we store in Solr Cores. One core can contain one or more Documents.
3.5 What is a Field?
In Apache Solr, a Field is actual data stored in a Document. It is a key & value pair. The key indicates the field name and the value contains that field’s data. One Document can contain one or more Fields. Apache Solr uses this Field data to index the Document content.
Good terminology reference: https://cwiki.apache.org/confluence/display/SOLR/SolrTerminology
Config Set: A set of config files necessary for a core to function properly. Each config set has a name. At minimum this will consist of solrconfig.xml (SolrConfigXml) and schema.xml (SchemaXml),
My memo: A document is returned as JSON, and its key-value pairs are fields.
Create a core.
./bin/solr create -c Solr_sample
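To see a document and its fields in practice, a small JSON document can be posted to the new core and queried back. This is a sketch under assumptions: doc_json is a hypothetical helper, the title field is made up for illustration (the _default config set is expected to add unknown fields automatically), and SOLR_HOST is an assumed environment variable.

```shell
CORE="Solr_sample"

# Build a one-document JSON array; each key-value pair is a field.
# Assumption: "title" is only an example field name.
doc_json() {
  printf '[{"id":"%s","title":"%s"}]\n' "$1" "$2"
}

# SOLR_HOST is an assumed variable; without it the document is only printed.
if [ -n "${SOLR_HOST:-}" ]; then
  base="http://$SOLR_HOST:8983/solr/$CORE"
  # Index the document and commit so that it becomes searchable right away.
  curl -fsS -X POST -H 'Content-Type: application/json' \
    --data "$(doc_json 1 'hello solr')" "$base/update?commit=true"
  # Fetch it back by id; the response lists the document's fields as JSON.
  curl -fsS "$base/select?q=id:1"
else
  doc_json 1 'hello solr'
fi
```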
https://cwiki.apache.org/confluence/display/SOLR/SolrTerminology
https://subscription.packtpub.com/book/big_data_and_business_intelligence/9781783553235/1/ch01lvl1sec10/the-solr-architecture-and-directory-structure
https://www.intra-mart.jp/document/library/iap/public/im_contents_search/solr_administrator_guide/texts/about/index.html
*** [WARN] *** Your open file limit is currently 1024.
It should be set to 65000 to avoid operational disruption.
If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh
*** [WARN] *** Your Max Processes Limit is currently 15537.
It should be set to 65000 to avoid operational disruption.
If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh
Error: Could not find or load main class org.apache.solr.util.SolrCLI
Caused by: java.lang.ClassNotFoundException: org.apache.solr.util.SolrCLI
With standalone Solr (not SolrCloud), a core can be backed up by opening the following URL in any browser.
http://solrnode.com:8983/solr/{{ name_of_a_core }}/replication?command=backup
You should get a response like the one below.
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1093</int>
</lst>
<str name="status">OK</str>
</response>
The backup data is stored under a path like {{ path_to_solr_instance }}/{{ name_of_a_core }}/data/snapshot.{{ datetime_info }}
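After a few backups, the newest snapshot directory can be picked out from the shell. This is a minimal sketch: latest_snapshot is a hypothetical helper, SOLR_DATA_DIR is an assumed variable pointing at the core's data directory, and it assumes the datetime suffix sorts in chronological order when compared as text.

```shell
# Print the newest snapshot.* directory inside a core's data directory.
# Assumption: the datetime suffix sorts chronologically as plain text.
latest_snapshot() {
  latest=""
  # Globs expand in sorted order, so the last directory seen is the newest.
  for d in "$1"/snapshot.*; do
    if [ -d "$d" ]; then
      latest="$d"
    fi
  done
  if [ -n "$latest" ]; then
    printf '%s\n' "$latest"
  else
    return 1
  fi
}

# SOLR_DATA_DIR is an assumed variable; nothing is listed when it is unset.
if [ -n "${SOLR_DATA_DIR:-}" ]; then
  latest_snapshot "$SOLR_DATA_DIR" || echo "no snapshot found in $SOLR_DATA_DIR"
fi
```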
https://www.youtube.com/watch?v=Zw4M4NGv-Rw
<field> tag