
sail

LakeSail's computation framework with a mission to unify batch processing, stream processing, and compute-intensive AI workloads.

Repository Info

Stars: 881
Forks: 44
Watchers: 881
Issues: 106
Language: Rust
License: Apache License 2.0

Documentation

Sail

The mission of Sail is to unify stream processing, batch processing, and compute-intensive (AI) workloads. Currently, Sail features a drop-in replacement for Spark SQL and the Spark DataFrame API in both single-host and distributed settings.

✨Please check out our MCP server that brings data analytics in Spark to both LLM agents and humans!✨

Installation

Sail is available as a Python package on PyPI. You can install it along with PySpark in your Python environment.

pip install pysail
pip install "pyspark[connect]"

Alternatively, as of Spark 4.0 you can install the lightweight client package pyspark-client. The pyspark-connect package, which is equivalent to pyspark[connect], is also available as of Spark 4.0.

The Installation guide explains how to install Sail from source, which can give better performance on your hardware architecture.

Getting Started

Starting the Sail Server

Option 1: Command Line Interface

You can start the local Sail server using the sail command.

sail spark server --port 50051
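
If you would rather launch this command from a Python script (for example in a test harness), a minimal sketch might look like the following. The helper names `sail_server_command` and `start_sail_server` are hypothetical and not part of Sail; the sketch only uses the CLI arguments shown above and assumes the `sail` executable is on your PATH.

```python
import shutil
import subprocess


def sail_server_command(port: int = 50051) -> list[str]:
    """Build the argument list for `sail spark server --port <port>`."""
    if not 0 < port < 65536:
        raise ValueError(f"invalid TCP port: {port}")
    return ["sail", "spark", "server", "--port", str(port)]


def start_sail_server(port: int = 50051) -> subprocess.Popen:
    """Start the Sail server as a child process and return its handle."""
    # Fail early with a clear message if the CLI is not installed.
    if shutil.which("sail") is None:
        raise FileNotFoundError("the `sail` command was not found on PATH")
    return subprocess.Popen(sail_server_command(port))
```

Calling `proc = start_sail_server()` returns immediately; `proc.terminate()` followed by `proc.wait()` stops the child process when you are done.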

Option 2: Python API

You can start the local Sail server using the Python API.

from pysail.spark import SparkConnectServer

server = SparkConnectServer(port=50051)
server.start(background=False)
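
The snippet above passes background=False, which presumably keeps the calling thread occupied while the server runs. For scripts that need the server and the client in one process, a rough sketch could start the server in the background instead. Two things here are assumptions rather than confirmed API: that `start(background=True)` returns immediately, and that the server object offers a `stop()` method. The function name `run_query_on_embedded_sail` is hypothetical.

```python
def run_query_on_embedded_sail(query: str = "SELECT 1 + 1"):
    """Start Sail in the background, run one SQL query, and return the rows."""
    # Imported lazily so this module loads even where pysail/pyspark are absent.
    from pysail.spark import SparkConnectServer
    from pyspark.sql import SparkSession

    server = SparkConnectServer(port=50051)
    # Assumption: background=True returns immediately instead of blocking.
    server.start(background=True)
    try:
        spark = SparkSession.builder.remote("sc://localhost:50051").getOrCreate()
        try:
            return spark.sql(query).collect()
        finally:
            spark.stop()
    finally:
        # Assumption: the server object exposes stop() for shutdown.
        server.stop()
```

This keeps the server lifetime tied to a single function call, which is convenient for tests but not a substitute for a long-running deployment.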

Option 3: Kubernetes

You can deploy Sail on Kubernetes and run Sail in cluster mode for distributed processing. Please refer to the Kubernetes Deployment Guide for instructions on building the Docker image and writing the Kubernetes manifest YAML file.

kubectl apply -f sail.yaml
kubectl -n sail port-forward service/sail-spark-server 50051:50051

Connecting to the Sail Server

Once you have a running Sail server, you can connect to it in PySpark. No changes are needed in your PySpark code!

from pyspark.sql import SparkSession

spark = SparkSession.builder.remote("sc://localhost:50051").getOrCreate()
spark.sql("SELECT 1 + 1").show()
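
When the server is started by a separate process or through a port-forward, the client may try to connect before the port is open. The sketch below, using only the Python standard library, waits for the TCP port and builds the `sc://` URL used above. The helper names `spark_connect_url` and `wait_for_port` are hypothetical, and an open port only shows that something is listening, not that the server is fully ready.

```python
import socket
import time


def spark_connect_url(host: str = "localhost", port: int = 50051) -> str:
    """Format a Spark Connect URL such as sc://localhost:50051."""
    return f"sc://{host}:{port}"


def wait_for_port(host: str, port: int, timeout: float = 10.0) -> bool:
    """Return True once a TCP connection succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)  # Not listening yet; retry shortly.
    return False
```

A typical use is to call `wait_for_port("localhost", 50051)` and, if it returns True, pass `spark_connect_url()` to `SparkSession.builder.remote(...)`.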

Please refer to the Getting Started guide for further details.

Documentation

The documentation of the latest Sail version can be found here.

Further Reading

  • Supercharge Spark: Quadruple Speed, Cut Costs by 94% - This post presents detailed benchmark results comparing Sail with Spark.
  • Sail 0.2 and the Future of Distributed Processing - This post discusses the Sail distributed processing architecture.

Contributing

Contributions are more than welcome!

Please submit GitHub issues for bug reports and feature requests. You are also welcome to ask questions in GitHub discussions.

Feel free to create a pull request if you would like to make a code change. You can refer to the development guide to get started.

Support

LakeSail offers flexible enterprise support options for Sail. Please contact us to learn more.

Quick Start

1. Clone the repository

git clone https://github.com/lakehq/sail

2. Change into the repository directory

cd sail

3. Follow the documentation

Check the repository's README.md file for specific installation and usage instructions.

Repository Details

Owner: lakehq
Repo: sail
Language: Rust
License: Apache License 2.0
Last fetched: 8/10/2025
