Welcome to my GitHub! I'm a data guy (analytics, engineering, science, and a little bit of AI) with a Master's in Advanced Data Analytics and a solid foundation in Data Analytics, Data Science, Data Engineering, MLOps, and Business Analytics, plus a keen interest in AI applications. I'm passionate about building data-driven solutions that drive growth, innovation, and operational efficiency. My background spans data architecture, scalable ML pipelines, cloud computing, and turning data into actionable insights that help teams make strategic decisions.
- Former Product Lead at Cirrus Nexus (Cumulus Nexus India Pvt Ltd)
- Experienced in Python, R, SQL, Rust, C++, Go, Terraform, and advanced ML frameworks like TensorFlow, PyTorch, and Scikit-Learn
- Proficient in Cloud Platforms: AWS (SageMaker, Glue, Redshift, Lambda), Azure (Data Factory, Synapse, HDInsight, ML Studio), GCP (BigQuery, Looker, Vertex AI Platform); Certified in AWS, Azure, GCP, and Kubernetes
- Skilled in Data Engineering (ETL, Data Modeling, Real-Time Streaming), MLOps (CI/CD, Model Deployment), and Data Science (Predictive Modeling, NLP, Computer Vision)
- Advocate for Cloud Cost Optimization strategies, helping companies cut costs while improving performance through structured planning
- Specialized in Natural Language Processing, Large Language Models (LLMs), Retrieval Augmented Generation (RAG), FAISS, AI Agents, and Vector Databases
- Data Engineering & Big Data Pipelines: Architecting and optimizing ETL pipelines for large-scale data processing with Apache Spark, Flink, Superset, Dagster, Druid, Delta Lake, dbt, Airflow, Snowflake, and Fivetran
- MLOps Pipelines: Building end-to-end ML pipelines with Kubernetes, Docker, Jenkins, and Kubeflow to automate model training and deployment, with a focus on scalability and CI/CD workflows
- Generative AI & NLP Models: Developing cutting-edge models for NLP, including language models and sentiment analysis, using transformer architectures
- Cloud Infrastructure Optimization: Implementing efficient infrastructure using Terraform and IaC (Infrastructure as Code) to optimize cloud resources on AWS, Azure, and GCP
- Generative AI & LLMs: Building production-ready LLM applications using LangChain, LlamaIndex, and Vector Databases (Pinecone, Weaviate, Milvus). Implementing RAG pipelines with custom knowledge bases and hybrid search strategies. Designing AI Agents using CrewAI, Weaviate, and other tools.
- Scaling Machine Learning Operations: Expanding knowledge in MLflow, Argo, and advanced MLOps for seamless deployment and monitoring of ML models
- Distributed Systems & Real-Time Analytics: Exploring Apache Flink, Kafka, and Delta Lake for real-time analytics and streaming solutions
- Advanced Data Engineering: Diving deeper into data warehouse and data lake architecture, leveraging platforms like Snowflake and Databricks
- Advanced LLM Engineering: Exploring LLM fine-tuning, prompt engineering, and context window optimization techniques for enterprise applications
- Tools & Platforms: Apache Spark, Kafka, Hadoop, Snowflake, Databricks, Apache Airflow, Fivetran, dbt
- Cloud & Big Data: AWS (Lambda, Glue, RDS, S3, EMR, Redshift), Azure Data Factory, Azure Databricks, Azure Synapse, GCP BigQuery, Snowflake
- Skills: Data Pipeline Design, ETL Optimization, Data Modeling, Real-Time Data Streaming
- Languages & Libraries: Python, R, Julia, Scala, Java, SQL, Scikit-Learn, TensorFlow, PyTorch, PySpark, Keras, Pandas, Dask
- Specializations: Predictive Modeling, Time Series, NLP, Deep Learning, Hyperparameter Tuning, Computer Vision
- MLOps Tools: Docker, Kubernetes, Jenkins, MLflow, Kubeflow, Argo, Terraform, GitHub Actions
- CI/CD & Automation: CI/CD Pipelines, Model Versioning, Model Deployment, Monitoring & Logging
- Visualization Tools: Power BI, Tableau, Plotly, Matplotlib, ggplot2
- Business Tools: JIRA, Confluence, Lucidchart, Microsoft Visio, Business Process Mapping, Requirements Analysis
- Frameworks: LangChain, LlamaIndex, Semantic Kernel, OpenAI API, Anthropic API
- Vector Databases: Pinecone, Weaviate, Milvus, Chroma, FAISS
- Skills: RAG Pipeline Design, Prompt Engineering, LLM Fine-tuning, Embedding Optimization, Context Window Management
- Data Engineering & Cloud: AWS Cloud Data Engineer, Azure Data Engineer, Google Cloud Professional Data Engineer, SnowPro Core, Meta Database Engineer
- Machine Learning & Data Science: TensorFlow Developer, AWS Certified Machine Learning Specialty, IBM Data Science Professional
- MLOps & DevOps: Certified Kubernetes Administrator, Terraform Associate, Databricks Certified for Apache Spark
- Tools: R, SQL, Tableau, ETL
- Summary: Advanced to Round 2 among 400 teams by designing KPIs to track healthcare patient engagement, creating impactful insights for targeted health improvement.
- Tools: Kafka, AWS Lambda, Spark
- Summary: Built a real-time data streaming architecture to process and analyze data instantly, achieving 99.9% system availability and reducing latency for business-critical decisions.
- Tools: Python, Scikit-Learn, AWS
- Summary: Developed a predictive model with 86.2% accuracy to forecast customer churn, allowing for proactive retention strategies and enhancing customer engagement.
- Tools: Python, Apache Airflow, AWS SageMaker
- Summary: Created an ML pipeline automating data preprocessing, model training, and deployment, reducing operational costs by 14% while maintaining high model performance.
- Tools: LangChain, OpenAI API, ChromaDB
- Summary: Built a Q&A system over internal documentation using RAG, achieving 85% query relevance while reducing response time by 60% compared to manual searches.
- Email: [email protected]
- LinkedIn: linkedin.com/in/chaitanyavankadaru
- Blog/Newsletter: Coming soon, where I'll share insights on data engineering, MLOps, and AI-driven strategies!
- Tea over Coffee! Extra fuel for complex problem-solving.
- Avid puzzle solver and lover of challenging data problems.
- I enjoy exploring the latest in Generative AI and contributing to open-source projects.
Thanks for stopping by my profile! Feel free to explore my repos, and let's collaborate if you share similar interests or need insights on cloud and AI solutions.