archive-backend

Backend for new archive site

Overview

Backend for the new archive site, including ETLs from the BB Metadata DB (MDB) to Elasticsearch.

Commands

archive-backend is meant to be run from the command line.
Type archive-backend <command> -h to see how to use each command.

archive-backend server

Execute the backend API server for the new archive site.

archive-backend version

Print the version of archive-backend

Configuration

The default config file is config.toml in your current working directory.

See config.sample.toml for a sample config file.

Release and Deployment

Once development is done and all tests are green, we are ready to go live.
To release, run misc/release.sh.

To add a pre-release tag, set the PRE_RELEASE environment variable. For example:

PRE_RELEASE=rc.1 misc/release.sh

MDB models

When the MDB schema changes, the mdb package must be updated. Run this script:

misc/update_mdb_models.sh

Elasticsearch

(See "Elasticsearch installation for Windows" below for instructions on installing Elasticsearch on Windows.)

To enable mlockall on CentOS 7, see:
http://mrzard.github.io/blog/2015/03/25/elasticsearch-enable-mlockall-in-centos-7/

Plugins

  1. Hebrew plugin:
    https://github.com/synhershko/elasticsearch-analysis-hebrew
  2. Phonetic plugin, instead of the standard analyzer for exact match (so that הריון is treated the same as היריון):
    sudo bin/elasticsearch-plugin install analysis-phonetic
    https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-phonetic.html

WIP - does not work yet.

  3. ICU plugin to transliterate Russian (and other languages) to enable phonetic analysis on them:
    sudo bin/elasticsearch-plugin install analysis-icu
  4. Ukrainian analyzer (fails for standard - not started)

Build index

Two more dependencies are required to build the index:

  1. Open Office (soffice binary) - to convert all doc files to docx.
  2. python-docx Python library - to extract text from docx files:
  • pip install python-docx
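
As a rough illustration of how these two dependencies fit together, the Python sketch below builds a headless soffice command that converts a doc file to docx and then reads the paragraphs with python-docx. The function names and the output-directory layout are hypothetical, and the soffice options shown (--headless, --convert-to, --outdir) are an assumption about how the conversion could be run; the project's own indexing code may invoke these tools differently.

```python
import os
import subprocess


def build_convert_command(soffice_bin, doc_path, out_dir):
    """Return the argument list for a headless doc -> docx conversion.

    Assumption: soffice is called with its standard
    --headless / --convert-to / --outdir options.
    """
    return [soffice_bin, "--headless", "--convert-to", "docx",
            "--outdir", out_dir, doc_path]


def docx_path_for(doc_path, out_dir):
    """Return the expected output path: same base name, .docx extension."""
    base = os.path.splitext(os.path.basename(doc_path))[0]
    return os.path.join(out_dir, base + ".docx")


def extract_text(docx_path):
    """Return the body paragraphs of a docx file, one paragraph per line."""
    # Imported here so the helpers above work without python-docx installed.
    from docx import Document
    return "\n".join(p.text for p in Document(docx_path).paragraphs)


def doc_to_text(soffice_bin, doc_path, out_dir):
    """Convert doc_path to docx with soffice, then return its text."""
    subprocess.check_call(build_convert_command(soffice_bin, doc_path, out_dir))
    return extract_text(docx_path_for(doc_path, out_dir))
```

For example, doc_to_text("soffice", "lesson.doc", "out") would write out/lesson.docx and return its paragraph text, provided soffice is on the PATH and python-docx is installed.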

Elasticsearch installation for Windows

  Before you start, download and install the Java Virtual Machine for Windows from
    http://www.oracle.com/technetwork/java/javase/downloads/jre8-downloads-2133155.html

  1. Download and install the Elasticsearch 5.6.0 MSI from
    https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.0.msi

  2. Open CMD as administrator

    1. Go to Elasticsearch bin directory

      cd C:\Program Files\Elastic\Elasticsearch\bin
      
    2. To install analysis-phonetic, type:

      elasticsearch-plugin install analysis-phonetic
      
    3. To install the Hebrew plugin, type:

      elasticsearch-plugin install https://bintray.com/synhershko/elasticsearch-analysis-hebrew/download_file?file_path=elasticsearch-analysis-hebrew-5.6.0.zip
      
    4. Answer 'y' to the security question

      Continue with installation? [y/N]y
      
    5. To install the ICU plugin, type:

      elasticsearch-plugin install analysis-icu
      
  3. Download and install Python - version 2.7.x
    https://www.python.org/downloads/

  4. Install python-docx (to get text from docx):

    • In CMD, go to the Python directory:
    cd C:\Python27
    
    • Then type:
    python -m pip install python-docx
    
  5. Download and install LibreOffice (not OpenOffice!)

    https://www.libreoffice.org/donate/dl/win-x86_64/5.4.5/en-US/LibreOffice_5.4.5_Win_x64.msi

    In config.toml, set the 'soffice-bin' value in the [elasticsearch] section to the full path of soffice.exe:
    "C://Program Files//LibreOffice 5//program//soffice.exe"

  6. Copy the required commented-out lines related to Windows from config.sample.toml into config.toml.

  7. Updating assets:

    For data to be indexed correctly, update the ES mapping configuration files (JSON files in /data/es/mappings):

    1. Run \es\mappings\make.py with Python from the root path of the project. For example:
      C:\Users\[USER]\go\src\github.com\Bnei-Baruch\archive-backend>python C:\Users\[USER]\go\src\github.com\Bnei-Baruch\archive-backend\es\mappings\make.py
      
    2. From the root path of the project, type:
      go-bindata -debug data/...
      
    3. Edit the bindata.go file (located in the root folder) and replace "package main" with "package bindata".
    4. Move the modified bindata.go file to the /bindata folder (delete the old bindata.go from /bindata if it exists, and make sure bindata.go no longer exists in the root folder).
    5. Repeat these steps whenever make.py is changed and run.
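
Steps 3 and 4 are mechanical and easy to get wrong by hand, so they could be scripted. The Python sketch below is one possible way to do that and is not a script that ships with the project: the function name relocate_bindata is hypothetical, and it assumes the generated bindata.go sits in the project root and contains a single "package main" clause.

```python
import io
import os
import re


def relocate_bindata(project_root):
    """Rewrite the package clause of bindata.go and move it into bindata/.

    Returns the path of the relocated file.
    """
    src = os.path.join(project_root, "bindata.go")
    dst_dir = os.path.join(project_root, "bindata")
    dst = os.path.join(dst_dir, "bindata.go")

    with io.open(src, encoding="utf-8") as f:
        code = f.read()

    # Step 3: replace the first "package main" clause with "package bindata".
    code, count = re.subn(r"(?m)^package main[ \t]*$", "package bindata",
                          code, count=1)
    if count != 1:
        raise ValueError("no 'package main' clause found in " + src)

    # Step 4: write into bindata/, overwriting any old bindata.go there.
    if not os.path.isdir(dst_dir):
        os.makedirs(dst_dir)
    with io.open(dst, "w", encoding="utf-8") as f:
        f.write(code)

    # Make sure no bindata.go is left in the project root.
    os.remove(src)
    return dst
```

For example, calling relocate_bindata(".") from the project root right after go-bindata has produced bindata.go performs both steps in one go.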

Install go-bindata

go get -u github.com/jteeuwen/go-bindata/...

License

MIT
