🤖 GPT-4 Sentiment: Bullish 0.73 | BERT Model Accuracy: 94.2% | Transformer Loss: 0.0234
📊 Sharpe Ratio: 1.84 | Alpha: 0.067 | Beta: 0.92 | Vol: 14.3% | Max DD: -8.1%
⚡ HFT Latency: 12μs | Order Fill: 99.97% | Slippage: 0.02bps | PnL: +$47.2K
🧠 ML Models: Random Forest 87.3% | XGBoost 91.2% | LSTM 89.7% | Ensemble 93.1%
📈 Options Greeks: Δ=0.65 Γ=0.03 Θ=-0.08 ν=0.25 ρ=0.12 | IV: 24.6%
💎 DeFi APY: Uniswap 12.5% | Compound 8.3% | Aave 6.7% | Curve 15.2%
🔮 AI Predictions: Price Target $485 | Probability: 68% | Confidence: High
⚡ Quant Signals: Mean Reversion ✓ | Momentum ✗ | Pairs Trading ✓ | Arbitrage ✓
🌐 LLM Performance: Token/sec: 45 | BLEU Score: 0.89 | Perplexity: 12.4
📊 Risk Metrics: VaR(95%): -2.3% | CVaR: -3.8% | Correlation Matrix Updated

Hi, I'm Siddarth

Quantitative Researcher & Data Scientist specializing in algorithmic trading strategies, financial modeling, and AI-driven market analysis to unlock alpha in modern financial markets.

Siddarth Sivasubramanian

My Projects

A showcase of my recent work and contributions

Customer Analytics Dashboard

Interactive dashboard for customer behavior analysis using Python, Pandas, and Plotly.

Python Pandas Plotly

Predictive Analytics Model

Machine learning model for sales forecasting with 95% accuracy using scikit-learn.

Python Scikit-learn TensorFlow

Portfolio Tracker App

Full-stack web application for tracking investment portfolios with real-time data.

React Node.js MongoDB

About Me

Get to know me better

I'm a passionate Data Scientist with a strong background in machine learning, statistical analysis, and full-stack development. My journey in technology began with a curiosity about how data can tell stories and drive meaningful decisions.

With expertise in Python, R, and modern web technologies, I bridge the gap between data science and practical applications. I love creating end-to-end solutions that not only analyze data but also present it in user-friendly interfaces.

I'm particularly interested in quantitative finance and the application of data science techniques to financial markets. Additionally, I'm passionate about generative AI and large language models (LLMs), exploring how these cutting-edge technologies can revolutionize financial analysis and decision-making. While I'm building experience in these domains, I'm eager to explore algorithmic trading, risk modeling, AI-powered financial analysis, and automated trading systems.

When I'm not coding or diving deep into datasets, you'll find me exploring new technologies, studying financial markets, contributing to open-source projects, or sharing knowledge with the developer community.

Technical Skills

Data Science

  • Python (Pandas, NumPy, Scikit-learn)
  • Machine Learning & Deep Learning
  • Statistical Analysis & Time Series
  • Data Visualization (Matplotlib, Plotly)

Development

  • JavaScript (React, Node.js)
  • HTML5 & CSS3
  • Database Design (SQL, MongoDB)
  • Version Control (Git)

Quantitative Finance (Learning)

  • Financial Data Analysis
  • Risk Modeling & Portfolio Theory
  • Algorithmic Trading Concepts
  • Financial Libraries (QuantLib, yfinance)

AI & ML Technologies

  • Large Language Models (LLMs)
  • Generative AI & Transformers
  • Natural Language Processing
  • AI Model Fine-tuning

Tools & Platforms

  • Jupyter Notebooks & Google Colab
  • Docker & Cloud Platforms (AWS, Azure)
  • Financial APIs (Alpha Vantage, Yahoo)
  • OpenAI API & Hugging Face

Professional Experience

My journey in data science and technology

Data Engineer/Scientist

April 2025 - Present

Hitachi Global Air Power

  • Applied Recursive Feature Elimination (RFE) with Random Forest and SVM on sales data, reducing 500+ features to 80 key predictors and cutting processing time by 25% while maintaining 99% prediction accuracy
  • Leveraged PySpark to extract and process 10,000+ financial PDFs (200GB) and built a CNN-LSTM OCR model on Azure Cloud using Keras and TensorFlow, trained with CTC loss and achieving 98.5% accuracy
  • Deployed and tested multiple Deep Learning models (Seq2Seq, BERT, DistilBERT) on preprocessed financial data, choosing DistilBERT for its 95.2% accuracy and 50% faster recommendations for sales forecasting
Azure ML Databricks OpenAI API LangChain GitHub Actions

Data Analyst

February 2024 - March 2025

Hitachi Global Air Power

  • Utilized Python, PySpark, SQL, and Power BI to preprocess and visualize 20GB of financial transaction data, highlighting anomaly patterns and uncovering significant correlations (0.85) between payment behaviors
  • Built NLP pipelines using spaCy, Gensim, and NLTK to preprocess customer feedback, streamlining the data processing system and cutting processing time by 30%
  • Conducted feature engineering on customer behavior data, evaluating individual feature contribution to risk categories leveraging methods such as one-vs-all (Random Forest) and SHAP values
XGBoost Random Forest Azure Text Analytics BERT NLP

Data Engineer

July 2022 - December 2022

Pontoon Solutions

  • Designed and developed priority-based fair-share scheduling algorithms for distributed workforce management systems, optimizing resource allocation and improving system throughput by 35% across multiple client environments
  • Built scalable ETL workflows using Apache Airflow and Kafka, processing 500GB+ weekly workforce and payroll data with 99.9% uptime, supporting compliance and reporting across 15+ client data sources
  • Collaborated with business intelligence team to apply data engineering techniques enabling discovery of 12 new workforce optimization patterns, unlocking new avenues for talent acquisition and placement strategies
Airflow Kafka dbt AWS Snowflake Terraform
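
As an illustration of the priority-based fair-share idea mentioned above, here is a minimal sketch in Python. It is not the production algorithm: the function name `fair_share_order`, the tenant labels, and the weights are hypothetical, and it assumes each tenant's priority is a positive weight and each job has a known cost. At every step, the tenant with the lowest usage-to-weight ratio dispatches its next job.

```python
import heapq


def fair_share_order(jobs, weights):
    """Return job ids in dispatch order under weighted fair-share.

    jobs:    list of (job_id, tenant, cost) tuples, in submission order
    weights: dict mapping tenant -> positive priority weight (hypothetical)
    """
    # Group jobs per tenant, preserving submission order.
    queues = {}
    for job_id, tenant, cost in jobs:
        queues.setdefault(tenant, []).append((job_id, cost))

    # Heap entries are (normalized usage, tenant); lowest usage runs next.
    heap = [(0.0, tenant) for tenant in queues]
    heapq.heapify(heap)
    next_index = {tenant: 0 for tenant in queues}

    order = []
    while heap:
        usage, tenant = heapq.heappop(heap)
        job_id, cost = queues[tenant][next_index[tenant]]
        next_index[tenant] += 1
        order.append(job_id)
        # Re-queue the tenant only if it still has pending jobs.
        if next_index[tenant] < len(queues[tenant]):
            heapq.heappush(heap, (usage + cost / weights[tenant], tenant))
    return order


# Hypothetical example: tenant A has twice the priority weight of tenant B.
jobs = [("a1", "A", 1), ("a2", "A", 1), ("a3", "A", 1),
        ("b1", "B", 1), ("b2", "B", 1)]
order = fair_share_order(jobs, {"A": 2, "B": 1})
```

A heap keeps each dispatch decision at O(log n) in the number of tenants; ties in normalized usage are broken by tenant name.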

Education & Key Projects

2021 - 2023

Northeastern University

  • Developed scalable quantitative research application using Streamlit, supporting simultaneous analysis of 5+ financial datasets and chunking market data into optimized segments using LangChain and Large Language Models
  • Engineered logistic regression and decision tree models for credit risk assessment, achieving 85% accuracy on dataset of 500,000 borrowers, validated through z-tests and chi-square tests at 95% confidence level
  • Identified the top 8 market risk factors explaining 75% of variance via ANOVA, leading to a 20% reduction in portfolio risk through targeted quantitative strategies and optimized trading policies
GPT-4 LangChain FAISS OpenAI API RAG
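
The kind of z-test validation mentioned above can be sketched with the Python standard library. This is only an illustration under stated assumptions: the function name `proportion_z_test`, the sample of 1,000 held-out borrowers, the 850 correct predictions, and the 0.80 baseline are all hypothetical and are not figures from the project.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, p0):
    """One-sided z-test of H0: p <= p0 against H1: p > p0.

    Returns the z statistic and the one-sided p-value.
    """
    p_hat = successes / n
    # Standard error under the null hypothesis.
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical numbers: 850 correct out of 1,000, tested against a 0.80 baseline.
z, p_value = proportion_z_test(850, 1000, 0.80)

# Reject H0 at the 95% confidence level when the p-value is below 0.05.
reject_null = p_value < 0.05
```

With these made-up inputs the test rejects the null hypothesis, meaning the observed accuracy is significantly above the assumed baseline at the 95% confidence level.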

Technical Proficiency

Python 95%
Machine Learning 90%
SQL & Databases 85%
Azure & Cloud Platforms 92%
LLM & Generative AI 88%
Data Engineering & ETL 90%

Get In Touch

Let's discuss opportunities and collaborations

Let's Connect

I'm always open to discussing new opportunities, interesting projects, or just having a chat about technology and data science.