
pokkoa baize yingzhao

An open-source project to simplify access to ancient Chinese divination datasets, offering various methods such as HTTP requests. 🚀 It is licensed under the MIT License and is free to use. 🎉 Any feedback is welcome! 💬

Repository Info

  • Stars: 1
  • Forks: 0
  • Watchers: 1
  • Issues: 0
  • Language: Python
  • License: MIT License

About This Server


Model Context Protocol (MCP) - This server can be integrated with AI applications to provide additional context and capabilities, enabling enhanced AI interactions and functionality.

Documentation

Pokkoa Yingzhao

The Pokkoa Yingzhao Text Matching System is a Python-based tool designed to process Chinese text documents, vectorize their content using TF-IDF, and find the most relevant texts matching user queries. It leverages SQLite for storing vectors and provides a simple command-line interface for interactive queries.


Dataset

The ancient Chinese books are stored as .txt files in the txt/ directory.

The txt directory contains:

  • 49 volumes of Yi Jing (易经) texts, including various commentaries and interpretations
  • 146 volumes of Shu Shu (术数) texts, including divination methods, fortune-telling, and geomancy
  • Various other ancient Chinese texts covering topics such as:
    • Divination systems (六壬, 奇门遁甲)
    • Geomancy (风水) and burial practices
    • Physiognomy (相术) and fortune-telling
    • Astronomical and calendrical systems
    • Military strategies and tactics
    • Philosophical interpretations of the Yi Jing

These texts span multiple dynasties, from the pre-Qin period through the Qing dynasty, with authors including prominent figures such as Zhu Xi (朱熹), Su Dongpo (苏东坡), Yang Xiong (杨雄), and many others.

Features

  • TF-IDF Vectorization: Uses sklearn's TfidfVectorizer to transform text into vectors.
  • Chinese Text Segmentation: Utilizes jieba for Chinese word segmentation.
  • SQLite Storage: Stores document vectors and the TF-IDF vectorizer in an SQLite database.
  • Similarity Matching: Computes cosine similarity between query input and document vectors.
  • Interactive CLI: Allows real-time querying and result display.
  • Debug Mode: Offers detailed logging for processing steps.
  • Protocol Support: Accessible over HTTP, gRPC, and MCP (Model Context Protocol).
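
The vectorize-then-match pipeline described in the features above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's implementation: to stay dependency-free it splits text into character bigrams in place of jieba segmentation and computes TF-IDF by hand in place of sklearn's TfidfVectorizer. The function names (bigrams, build_index, weigh, cosine, find_relevant) and the sample documents are hypothetical.

```python
import math
from collections import Counter


def bigrams(text):
    """Split text into overlapping character bigrams (a stand-in for jieba)."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + 2]) for i in range(len(chars) - 1)]


def weigh(tokens, idf):
    """Turn tokens into a sparse TF-IDF vector; terms without an IDF are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    return {t: n * idf[t] for t, n in counts.items()}


def build_index(docs):
    """Return (idf, vectors) for a {name: text} mapping."""
    tokenized = {name: bigrams(text) for name, text in docs.items()}
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))
    # Smoothed IDF: log((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    vectors = {name: weigh(tokens, idf) for name, tokens in tokenized.items()}
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_relevant(query, idf, vectors, top_n=3):
    """Rank documents by cosine similarity to the query, best first."""
    q = weigh(bigrams(query), idf)
    scored = [(name, cosine(q, vec)) for name, vec in vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical two-document corpus for illustration only.
docs = {
    "a.txt": "乾为天，元亨利贞",
    "b.txt": "风水之法，得水为上",
}
idf, vectors = build_index(docs)
print(find_relevant("元亨利贞", idf, vectors, top_n=1))
```

With real word segmentation the vocabulary would consist of words rather than bigrams, but the ranking step (weight the query with the stored IDF values, then sort by cosine similarity) would likely look much the same.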

Stop Words

Place the stop-word list at stopwords\stop_words.txt. The stop words come from https://github.com/elephantnose/characters.
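
As a rough sketch of how a stop-word file might be applied after segmentation, the helpers below load a list and filter tokens. They assume the file is UTF-8 with one stop word per line, which the source does not state; load_stop_words and remove_stop_words are hypothetical names, not functions from this project.

```python
from pathlib import Path


def load_stop_words(path):
    """Read one stop word per line (assumed format), ignoring blank lines."""
    text = Path(path).read_text(encoding="utf-8")
    return {line.strip() for line in text.splitlines() if line.strip()}


def remove_stop_words(tokens, stop_words):
    """Keep only the tokens that are not stop words, preserving order."""
    return [t for t in tokens if t not in stop_words]


# With the real file in place, the set would be loaded like this:
# stop_words = load_stop_words(Path("stopwords") / "stop_words.txt")
tokens = ["易经", "的", "注释"]
print(remove_stop_words(tokens, {"的"}))
```

Using a set for the stop words keeps each membership check constant-time, which matters when filtering long documents.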

Installation

Ensure you have Python installed (>= 3.8), then install the necessary dependencies:

pip install numpy jieba scikit-learn

Usage

  1. Initialize the system:

from text_matching import TextMatchingSystem

# Enable debug mode for detailed logging
system = TextMatchingSystem(debug=True)

  2. Build vectors from a directory of text files. Ensure you have a directory (e.g., ./txt) containing .txt files:

system.build_vectors_from_directory('./txt')

  3. Find relevant texts for a query:

# '你的查询文本' is a placeholder meaning "your query text"
results = system.find_relevant_text('你的查询文本', top_n=3)
for result in results:
    print(f"{result['filename']} (Similarity: {result['similarity']:.4f})")
    print(result['text'])

  4. Add new documents dynamically:

# The second argument is the document content ("This is the new document content.")
system.add_new_document('new_file.txt', '这是新的文档内容。')

  5. Get database statistics:

stats = system.get_database_stats()
print(stats)

Running the CLI

You can run the provided CLI by executing the following command:

python text_matching.py

Follow the prompts to build vectors, check database stats, and query texts interactively.

Database Structure

The SQLite database (text_vectors.db) contains:

  • vectorizer table: Stores the serialized TF-IDF vectorizer.
  • document_vectors table: Stores document content and their corresponding vectors.
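
A two-table layout like the one listed above could be set up roughly as follows. The table names come from this section, but every column name, the use of pickle for serialization, and the helper functions are assumptions made for illustration; the project's actual schema may differ.

```python
import pickle
import sqlite3


def create_schema(conn):
    """Create both tables; column names are illustrative guesses."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS vectorizer (id INTEGER PRIMARY KEY, data BLOB)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS document_vectors ("
        "id INTEGER PRIMARY KEY, filename TEXT, content TEXT, vector BLOB)"
    )


def save_document(conn, filename, content, vector):
    """Store a document together with its pickled vector."""
    conn.execute(
        "INSERT INTO document_vectors (filename, content, vector) VALUES (?, ?, ?)",
        (filename, content, pickle.dumps(vector)),
    )


def load_documents(conn):
    """Return (filename, content, vector) tuples with vectors unpickled."""
    rows = conn.execute("SELECT filename, content, vector FROM document_vectors")
    return [(name, content, pickle.loads(blob)) for name, content, blob in rows]


# In-memory database for the demo; the project uses text_vectors.db on disk.
conn = sqlite3.connect(":memory:")
create_schema(conn)
save_document(conn, "new_file.txt", "这是新的文档内容。", {"文档": 1.0})
conn.commit()
print(load_documents(conn))
```

Pickled blobs should only be loaded from a database you created yourself, since unpickling untrusted data can execute arbitrary code.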

Debugging

Enable debug mode for verbose logging by initializing the system with:

system = TextMatchingSystem(debug=True)

License

This project is licensed under the MIT License. Feel free to use and modify it.


For any questions or feature requests, please open an issue or reach out!

About Pokkoa

  • Pokkoa website: pokkoa.com
  • LinkedIn: Pokkoa LinkedIn
  • Hugging Face: Pokkoa on Hugging Face
  • ✉️: contact@pokkoa.cc

Quick Start

  1. Clone the repository:

git clone https://github.com/jebberwocky/pokkoa-baize-yingzhao

  2. Install dependencies:

cd pokkoa-baize-yingzhao
pip install numpy jieba scikit-learn

  3. Follow the documentation: check the repository's README.md file for specific installation and usage instructions.

Repository Details

  • Owner: jebberwocky
  • Repo: pokkoa-baize-yingzhao
  • Language: Python
  • License: MIT License
  • Last fetched: 8/10/2025
