Sabari H

Software engineer specializing in optimization and deployment of state-of-the-art AI models on specialized hardware. Focused on model profiling, bottleneck analysis, and inference acceleration through quantization, pruning, and architecture-specific optimizations.

Experience

Multicoreware Inc Jun 2025 — Present

Software Developer

Deploying state-of-the-art LLMs on AI accelerators for high-speed inference. Working on model profiling, bottleneck analysis, and optimization techniques including quantization, pruning, and architecture-specific tuning.

Multicoreware Inc Dec 2024 — Jun 2025

Junior Software Developer (Intern)

Quantized and compiled AI models targeting specific hardware architectures. Focused on robotic perception models and large language models.

Finequs Jan 2024 — Mar 2024

Software Developer Intern

Built a Selenium-based automation system that cut data entry time from hours to minutes. Integrated the Tata Telecommunication API into the main product dashboard.

Projects

Heavy Hitter Oracle in vLLM

Implemented Heavy Hitter Oracle, a dynamic sparse KV-cache mechanism, in the vLLM inference engine. Achieved a 20–30% speedup at equivalent sparsity levels by selectively retaining high-attention keys during decoding.

vLLM, LLM Inference, Sparse Attention

Decentralized EHR System

Blockchain-based electronic health records system built during the PLI Hackathon at Sathyabama Institute of Technology. Won 50,000 XDC tokens in prize rewards.

Blockchain, XDC, Hackathon

Decentralized VPN

Designed a decentralized VPN system in which users act as relay nodes, earning incentives for routing traffic while providing privacy to other participants on the network.

Networking, P2P, Privacy