AI and Document Search
LLM workflows, PDF processing, retrieval systems, citations, and practical document automation.
AI · Agents · Cloud · LLM
AI tools to make sense of millions of documents. RAG pipelines built on millions of embeddings stored in vector databases. Run the latest LLMs efficiently and cost-effectively.
Solid AWS cloud pipelines for data ingestion.
Focused on AI applications, data engineering, cloud infrastructure, backend systems, and minimal user interfaces. More than a decade of experience with Python and AWS.
Turn millions of messy documents, raw data, and manual workflows into searchable systems, useful dashboards, and automated cloud-based applications. Make sense of a huge corpus of data using RAG pipelines that support AI-based decisions.
Group data and surface distinguishing characteristics, such as personas, using complex clustering algorithms.
AI/ML-related programming and infrastructure services.
LLM workflows, PDF processing, retrieval systems, citations, and practical document automation.
SQL analysis, spend reports, operational metrics, dashboards, ETL scripts, and decision-ready summaries.
GitHub Actions, APIs, background jobs, AWS deployment pipelines, cloud infrastructure, containers, and secure automation.
Four representative projects showing practical work in AI, analytics, cloud automation, and DevOps.
A document search and summarization workflow that extracts reliable answers from large PDF collections, with page-level citations to the extracted passages and concise final reports. Uses a Milvus vector database holding a large number of embeddings.
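The core idea, retrieval that returns the page it came from, can be sketched in a few lines. This is a minimal illustration, not the project's implementation: an in-memory list stands in for the Milvus collection, a toy bag-of-words vector stands in for a real embedding model, and the file names, the PageChunk type, and the search and cite helpers are all hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class PageChunk:
    doc: str   # source PDF file name (hypothetical)
    page: int  # 1-based page number the text was extracted from
    text: str


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real pipeline would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, chunks: list[PageChunk], top_k: int = 2):
    """Return the top_k (score, chunk) pairs, best match first."""
    q = embed(query)
    scored = [(cosine(q, embed(c.text)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def cite(chunk: PageChunk) -> str:
    """Format a page-level citation for a retrieved chunk."""
    return f"{chunk.doc}, p. {chunk.page}"


# Stand-in for a vector database collection: one entry per PDF page.
CHUNKS = [
    PageChunk("lease.pdf", 3, "The tenant pays rent on the first day of each month."),
    PageChunk("lease.pdf", 7, "Either party may terminate with ninety days written notice."),
    PageChunk("policy.pdf", 2, "Employees accrue vacation days every month."),
]

if __name__ == "__main__":
    for score, chunk in search("when is rent due each month", CHUNKS):
        print(f"{score:.2f}  [{cite(chunk)}]  {chunk.text}")
```

Because the document name and page number travel with each chunk, every answer built from the retrieved text can point back to the exact page it was drawn from.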
A corpus of millions of records was imported from freely available open data sources such as census and sales data. A solid, monitored pipeline was built on AWS resources. Finally, a clustering algorithm grouped records with similar characteristics, and RAG was applied on top of the resulting groups.
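The grouping step can be illustrated with a small sketch. The clustering algorithm used in the project is not named here, so plain k-means is assumed purely for illustration; the records, their two fields (age and monthly spend), and the kmeans helper are hypothetical.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its closest centroid.
        labels = [
            min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                centroids[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Hypothetical records: (age, monthly spend). Real features would normally
# be scaled first so that no single column dominates the distance.
RECORDS = [(22, 40.0), (25, 55.0), (24, 48.0), (58, 310.0), (61, 290.0), (63, 335.0)]

if __name__ == "__main__":
    labels, centroids = kmeans(RECORDS, k=2)
    for i, (age, spend) in enumerate(centroids):
        size = labels.count(i)
        print(f"group {i}: {size} records, mean age {age:.1f}, mean spend {spend:.1f}")
```

Each centroid is a compact summary of its group, which is the kind of description a persona can be written from before it is handed to a retrieval step.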
GitHub Actions and infrastructure automation for preview environments, background jobs, container workloads, secure deployment pipelines, and cloud cost control. Includes AWS Secrets Manager, code checking, and logging.
AWS Step Functions orchestrates complex workflows and pipelines. CUDA drivers are installed on AWS EC2 GPU instances so that models run locally. Monitoring uses CloudWatch metrics derived from CloudWatch log data.
Available for short-term and long-term projects, consulting, technical reviews, AI prototypes, data dashboards, backend systems, and cloud DevOps automation work.