Hi, I'm Siddarth

Quantitative Researcher & Data Scientist specializing in algorithmic trading strategies, financial modeling, and AI-driven market analysis to unlock alpha in modern financial markets.

My Projects

A showcase of my recent work and contributions

Customer Analytics Dashboard

Interactive dashboard for customer behavior analysis using Python, Pandas, and Plotly.

Python, Pandas, Plotly

Predictive Analytics Model

Machine learning model for sales forecasting with 95% accuracy using scikit-learn.

Python, Scikit-learn, TensorFlow

Portfolio Tracker App

Full-stack web application for tracking investment portfolios with real-time data.

React, Node.js, MongoDB

About Me

Get to know me better

I'm a passionate Data Scientist with a strong background in machine learning, statistical analysis, and full-stack development. My journey in technology began with a curiosity about how data can tell stories and drive meaningful decisions.

With expertise in Python, R, and modern web technologies, I bridge the gap between data science and practical applications. I love creating end-to-end solutions that not only analyze data but also present it in user-friendly interfaces.

I'm particularly interested in quantitative finance and the application of data science techniques to financial markets. Additionally, I'm passionate about generative AI and large language models (LLMs), exploring how these cutting-edge technologies can revolutionize financial analysis and decision-making. While I'm building experience in these domains, I'm eager to explore algorithmic trading, risk modeling, AI-powered financial analysis, and automated trading systems.

When I'm not coding or diving deep into datasets, you'll find me exploring new technologies, studying financial markets, contributing to open-source projects, or sharing knowledge with the developer community.

Technical Skills

Data Science

Python (Pandas, NumPy, Scikit-learn)
Machine Learning & Deep Learning
Statistical Analysis & Time Series
Data Visualization (Matplotlib, Plotly)

Development

JavaScript (React, Node.js)
HTML5 & CSS3
Database Design (SQL, MongoDB)
Version Control (Git)

Quantitative Finance (Learning)

Financial Data Analysis
Risk Modeling & Portfolio Theory
Algorithmic Trading Concepts
Financial Libraries (QuantLib, yfinance)

AI & ML Technologies

Large Language Models (LLMs)
Generative AI & Transformers
Natural Language Processing
AI Model Fine-tuning

Tools & Platforms

Jupyter Notebooks & Google Colab
Docker & Cloud Platforms (AWS, Azure)
Financial APIs (Alpha Vantage, Yahoo Finance)
OpenAI API & Hugging Face

Professional Experience

My journey in data science and technology

Data Engineer/Scientist

April 2025 - Present

Hitachi Global Air Power

Applied Recursive Feature Elimination (RFE) with Random Forest and SVM to sales data, reducing 500+ features to 80 key predictors and cutting processing time by 25% while maintaining 99% prediction accuracy
Used PySpark to extract and process 10,000+ financial PDFs (200GB) and built a CNN-LSTM OCR model on Azure using Keras and TensorFlow, trained with CTC loss and achieving 98.5% accuracy
Deployed and tested multiple deep learning models (Seq2Seq, BERT, DistilBERT) on preprocessed financial data, choosing DistilBERT for its 95.2% accuracy and 50% faster recommendations for sales forecasting

Azure ML, Databricks, OpenAI API, LangChain, GitHub Actions

Data Analyst

February 2024 - March 2025

Hitachi Global Air Power

Used Python, PySpark, SQL, and Power BI to preprocess and visualize 20GB of financial transaction data, highlighting anomaly patterns and uncovering strong correlations (0.85) between payment behaviors
Built NLP pipelines with spaCy, Gensim, and NLTK to preprocess customer feedback, streamlining data processing and cutting processing time by 30%
Conducted feature engineering on customer behavior data, evaluating each feature's contribution to risk categories using methods such as one-vs-all Random Forest classification and SHAP values

XGBoost, Random Forest, Azure Text Analytics, BERT, NLP

Data Engineer

July 2022 - December 2022

Pontoon Solutions

Designed and developed priority-based fair-share scheduling algorithms for distributed workforce management systems, optimizing resource allocation and improving system throughput by 35% across multiple client environments
Built scalable ETL workflows using Apache Airflow and Kafka, processing 500GB+ of workforce and payroll data weekly with 99.9% uptime and supporting compliance and reporting across 15+ client data sources
Collaborated with the business intelligence team to apply data engineering techniques that surfaced 12 new workforce optimization patterns, opening new avenues for talent acquisition and placement strategies

Airflow, Kafka, dbt, AWS, Snowflake, Terraform
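
As a rough illustration of the priority-based fair-share idea from the scheduling work above, the sketch below hands each time slice to the client whose consumed slices divided by its priority weight is currently lowest. The client names, the weights, and the `fair_share_order` function are hypothetical and are not taken from the production system.

```python
import heapq


def fair_share_order(weights, slots):
    """Return the order in which clients receive `slots` equal time slices.

    weights: dict mapping client name -> positive priority weight (hypothetical).
    Each slice goes to the client with the lowest usage / weight ratio,
    so a client with weight 2 receives roughly twice the slices of weight 1.
    """
    # Heap entries are (normalized usage, client name, raw usage);
    # ties on normalized usage are broken alphabetically by name.
    heap = [(0.0, name, 0) for name in sorted(weights)]
    heapq.heapify(heap)
    order = []
    for _ in range(slots):
        _, name, used = heapq.heappop(heap)
        order.append(name)
        used += 1
        heapq.heappush(heap, (used / weights[name], name, used))
    return order


# Illustrative weights only: client_a has twice the priority of client_b.
schedule = fair_share_order({"client_a": 2, "client_b": 1}, slots=6)
print(schedule)
```

Keeping the normalized usage in a heap makes each scheduling decision O(log n) in the number of clients, which is one common way to keep this kind of policy cheap as the client count grows.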

Education & Key Projects

2021 - 2023

Northeastern University

Developed a scalable quantitative research application using Streamlit, supporting simultaneous analysis of 5+ financial datasets and chunking market data into optimized segments using LangChain and large language models
Engineered logistic regression and decision tree models for credit risk assessment, achieving 85% accuracy on a dataset of 500,000 borrowers, validated through z-tests and chi-square tests at the 95% confidence level
Identified the top 8 market risk factors explaining 75% of variance via ANOVA, leading to a 20% reduction in portfolio risk through targeted quantitative strategies and optimized trading policies

GPT-4, LangChain, FAISS, OpenAI API, RAG
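
To give a sense of the kind of chi-square validation mentioned for the credit risk models, here is a minimal sketch of a Pearson chi-square test of independence on a 2x2 table, checked at the 95% confidence level. The counts are made up for illustration and are not from the project dataset; `chi_square_2x2` is a hypothetical helper, and no continuity correction is applied.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). With 1 degree of freedom the upper-tail
    probability of the chi-square distribution is erfc(sqrt(statistic / 2)).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Made-up counts: rows = model flagged high risk / low risk,
# columns = borrower defaulted / repaid.
toy_table = [[30, 10], [10, 30]]
stat, p_value = chi_square_2x2(toy_table)
significant = p_value < 0.05  # 95% confidence level
print(f"chi2={stat:.2f}, significant={significant}")
```

A p-value below 0.05 here would suggest that the model's risk flag and the actual outcome are not independent, which is the sort of evidence such a validation step looks for.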

Technical Proficiency

Python 95%

Machine Learning 90%

SQL & Databases 85%

Azure & Cloud Platforms 92%

LLM & Generative AI 88%

Data Engineering & ETL 90%

Get In Touch

Let's discuss opportunities and collaborations

Let's Connect

I'm always open to discussing new opportunities, interesting projects, or just having a chat about technology and data science.

Email

siddarthsivasubramanian@gmail.com

LinkedIn

siddarth-sivasubramanian

GitHub

github.com/siddarth-ss