mongo + monstache (sharded) + es data sync - config.toml
Published: 2021-05-08 04:49:44


Create a database named product and enable sharding on it:

```
use admin
sh.enableSharding("product")
db.runCommand({ "shardcollection": "product.tags", "key": { "_id": "hashed" } })

# check the data distribution
use product
db.tags.stats()

############ output ############
{
    "sharded" : true,
################################
```
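Sharding on `{"_id": "hashed"}` distributes documents by a hash of `_id` rather than by contiguous ranges, so even monotonically increasing ids spread evenly across shards. A rough Python illustration of the idea (md5-mod-N here is only a stand-in for MongoDB's internal 64-bit hash, and the two-shard count is an assumption):

```python
import hashlib
from collections import Counter

def bucket_for(_id, num_shards=2):
    # Stand-in for MongoDB's hashed index: hash the id, then pick a shard.
    digest = hashlib.md5(str(_id).encode()).hexdigest()
    return int(digest, 16) % num_shards

# Sequential ids still land on both shards in roughly equal numbers.
counts = Counter(bucket_for(i) for i in range(10_000))
print(dict(counts))
```

With a range-based shard key, the same sequential ids would pile onto a single shard until chunks were split and migrated.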

config-share-linux.toml

```toml
# connection settings
# connect to MongoDB using the following URL
mongo-url = "mongodb://root:123@192.168.1.100:4000/?authSource=admin"
# connect to the Elasticsearch REST API at the following node URLs
elasticsearch-urls = ["http://192.168.1.101:9200"]

# frequently required settings
# if you need to seed an index from a collection and not just listen and sync change events
# you can copy entire collections or views from MongoDB to Elasticsearch
direct-read-namespaces = ["product.tags"]

# if you want to use MongoDB change streams instead of legacy oplog tailing use change-stream-namespaces
# change streams require at least MongoDB API 3.6+
# if you have MongoDB 4+ you can listen for changes to an entire database or entire deployment
# in this case you usually don't need regexes in your config to filter collections unless you target the deployment.
# to listen to an entire db use only the database name.  For a deployment use an empty string.
change-stream-namespaces = ["product.tags"]

# additional settings
# if you don't want to listen for changes to all collections in MongoDB but only a few
# e.g. only listen for inserts, updates, deletes, and drops from mydb.mycollection
# this setting does not initiate a copy, it is only a filter on the change event listener
# namespace-regex = '^product\.tags$'
# compress requests to Elasticsearch
gzip = true
# generate indexing statistics
stats = true
# index statistics into Elasticsearch
index-stats = true
# use the following PEM file for connections to MongoDB
# mongo-pem-file = "/path/to/mongoCert.pem"
# disable PEM validation
# mongo-validate-pem-file = false
# use the following user name for Elasticsearch basic auth
# elasticsearch-user = "someuser"
# use the following password for Elasticsearch basic auth
# elasticsearch-password = "somepassword"
# use 4 go routines concurrently pushing documents to Elasticsearch
# elasticsearch-max-conns = 4
# use the following PEM file for connections to Elasticsearch
# elasticsearch-pem-file = "/path/to/elasticCert.pem"
# validate connections to Elasticsearch
# elastic-validate-pem-file = true
# propagate dropped collections in MongoDB as index deletes in Elasticsearch
dropped-collections = true
# propagate dropped databases in MongoDB as index deletes in Elasticsearch
dropped-databases = true
# do not start processing at the beginning of the MongoDB oplog
# if you set replay to true you may see version conflict messages
# in the log if you had synced previously. This just means that you are replaying old docs which are already
# in Elasticsearch with a newer version. Elasticsearch is preventing the old docs from overwriting new ones.
replay = true
# resume processing from a timestamp saved in a previous run
resume = true
# do not validate that progress timestamps have been saved
resume-write-unsafe = false
# override the name under which resume state is saved
resume-name = "default1"
# use a custom resume strategy (tokens) instead of the default strategy (timestamps)
# tokens work with MongoDB API 3.6+ while timestamps work only with MongoDB API 4.0+
resume-strategy = 1
# exclude documents whose namespace matches the following pattern
# namespace-exclude-regex = '^mydb\.ignorecollection$'
# turn on indexing of GridFS file content
# index-files = true
# turn on search result highlighting of GridFS content
# file-highlighting = true
# index GridFS files inserted into the following collections
# file-namespaces = ["users.fs.files"]
# print detailed information including request traces
verbose = true
# enable clustering mode
# cluster-name = 'apollo'
# do not exit after full-sync, rather continue tailing the oplog
exit-after-direct-reads = false
# enable-oplog = true
# delete-strategy = 1
# delete-index-pattern = "tags.tags"
# elasticsearch-max-conns = 15
# bulk requests default to 8 MB; lowering to 4 MB sends smaller batches and reduces failures
# elasticsearch-max-bytes = 4194304
# workers = ["Tom", "Dick", "Harry", "Harry1", "Harry2", "Harry3", "Harry4", "Harry5", "Harry6", "Harry7", "Harry8", "Harry9", "Harry10"]
fail-fast = true
elasticsearch-retry = true

[logs]
error = "/application/monstache/logs/error.log"

[[script]]
namespace = "product.tags"
script = """
module.exports = function(doc) {
    doc.HasName = doc.Name != null;
    for (var key in doc) {
        if (key != '_id' && key != 'Name' && key != 'HasName') {
            delete doc[key];
        }
    }
    return doc;
}
"""
```
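The `[[script]]` middleware at the end of the config trims each document down to `Name` plus a derived `HasName` flag before it is indexed, so Elasticsearch only ever stores those fields. The same projection, sketched in Python for clarity (the `transform` name is ours, not part of monstache):

```python
def transform(doc):
    """Mirror of the JS middleware: keep _id and Name, add HasName, drop the rest."""
    out = dict(doc)
    out["HasName"] = out.get("Name") is not None
    for key in list(out):
        if key not in ("_id", "Name", "HasName"):
            del out[key]
    return out

print(transform({"_id": 1, "Name": "phone", "Price": 99}))
# {'_id': 1, 'Name': 'phone', 'HasName': True}
```

Documents without a `Name` still sync, but arrive with only `HasName: false`, which makes it easy to query for incomplete records on the ES side.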

Copy the config file to the server and run:

```
cd /usr/local/monstache
monstache -f config-share-linux.toml
```

Run result:

Check whether the data has been synced into ES:
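A quick check is to count documents on the ES side and compare against `db.tags.count()` in the mongo shell. The index name below is an assumption: monstache indexes into the MongoDB namespace (`product.tags`) by default unless a custom index mapping is configured. A small Python sketch that builds the request:

```python
import json

# Assumption: the ES index is named after the MongoDB namespace.
index = "product.tags"
count_url = f"http://192.168.1.101:9200/{index}/_count"

# Count only docs carrying the HasName flag added by the [[script]] middleware,
# i.e. docs that passed through the sync pipeline.
query = {"query": {"exists": {"field": "HasName"}}}

print(count_url)
print(json.dumps(query))
```

Send it with `curl -s "$count_url" -H 'Content-Type: application/json' -d '<query>'`; the returned `count` should match the collection count in MongoDB once the direct reads finish.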
