In-Vivo ML Kidney Stone Analysis
Key Results
- 98.8% composition accuracy
- Real-time video stream support
- Per-frame classification output
About the project
A US-based medical device manufacturer developing endoscopic imaging systems engaged PerformaCode to design and implement a proof-of-concept AI/ML pipeline for in-vivo kidney stone analysis.
The project focused on endoscopic video and image data captured from embedded camera platforms, with emphasis on detecting kidney stones in situ and classifying their composition using machine-learning models. The work was scoped explicitly as a research and feasibility stage, intended to de-risk downstream productization rather than deliver a clinical-ready system.
PerformaCode was responsible for end-to-end technical execution, including dataset analysis and preparation, model architecture selection and training, construction of a video-capable inference pipeline, and production of technical documentation and findings to support future development and regulatory planning.
7 months
3 engineers
FP delivery model
Client challenges
The client was evaluating ML-based analysis of in-vivo endoscopic imagery, where input conditions differ materially from curated medical image datasets. Video frames exhibited variable illumination, motion artifacts, fluid occlusion, and sensor noise, raising uncertainty around whether kidney stone detection and composition classification could be performed reliably at frame level rather than as offline batch analysis.
At the system level, key feasibility questions were unresolved. There was no validated approach for sourcing and structuring composition-labeled training data for endoscopic video, limited evidence that preliminary models would scale across image resolutions, and unclear boundaries between research models and an embedded, video-stream–capable inference pipeline. These gaps made it difficult to quantify technical risk or define a credible path toward future productization.
Tasks performed
- Analyzed available kidney stone imaging datasets for composition labeling quality, resolution constraints, and applicability to endoscopic video
- Defined data selection and preprocessing criteria for in-vivo endoscopic imagery, including normalization and artifact handling
- Designed and trained ML models for kidney stone composition classification using public and synthetic datasets
- Evaluated model performance across input resolutions (128×128 and 256×256) to characterize accuracy sensitivity
- Developed an endoscopic image preprocessing pipeline addressing illumination variability, noise, and scale
- Implemented per-frame inference logic suitable for continuous video stream processing
- Integrated classification models into a video-capable inference pipeline
- Benchmarked frame-level inference behavior under variable image quality conditions
- Documented model architectures, training methodology, limitations, and feasibility findings to support future development planning
Project results
98.8% accuracy (128×128)
Model training and validation on composition-labeled images at low input resolution demonstrated stable classification performance under constrained visual conditions.
85.8% accuracy (256×256)
Higher-resolution inputs were evaluated using the same pipeline, enabling direct comparison of accuracy behavior as spatial detail increased.
Resolution impact quantified
Accuracy deltas between 128×128 and 256×256 inputs were quantified, clarifying compute-versus-performance trade-offs for future system design.
Per-frame inference validated
Inference was executed at the individual frame level, confirming deterministic output behavior compatible with continuous video streams.
Real-time video stream support
The inference pipeline was integrated with a live video-stream interface, demonstrating end-to-end processing from sequential frames without offline buffering assumptions.
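Per-frame streaming inference of this kind can be sketched as a simple loop: each incoming frame is classified independently, so a result is available as soon as the frame arrives, with no batching or offline buffering. This is a minimal illustration, not the delivered pipeline; `model` is any callable returning per-class logits, and the class names and signatures are hypothetical.

```python
import numpy as np
from typing import Callable, Iterable, Iterator, Tuple

def classify_stream(
    frames: Iterable[np.ndarray],
    model: Callable[[np.ndarray], np.ndarray],
    class_names: Tuple[str, ...],
) -> Iterator[Tuple[str, float]]:
    """Yield a (label, confidence) pair for each frame as it arrives.

    Frames are processed strictly one at a time, which is what makes the
    pipeline compatible with continuous video streams: output latency is
    bounded by single-frame inference time, not by batch size.
    """
    for frame in frames:
        logits = np.asarray(model(frame), dtype=np.float64)
        # Softmax over logits to get per-class probabilities.
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        idx = int(probs.argmax())
        yield class_names[idx], float(probs[idx])
```

In a live setting, `frames` would typically be fed from a capture source such as `cv2.VideoCapture`, with the generator consumed frame by frame.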
Stage II investment risk reduced
Quantified accuracy figures, characterized resolution trade-offs, and verified streaming behavior resolved the core feasibility unknowns, allowing the client to scope and schedule the next development stage with confidence.
Value we bring
Translating PoC results into product decisions
PerformaCode frames PoC work around decision points rather than feature validation. By explicitly surfacing constraints, failure modes, and non-obvious dependencies, the work exposes assumptions that would otherwise carry forward unchallenged. This allows teams to correct flawed specifications early instead of discovering them during integration or scale-up.
Turning feasibility data into roadmap inputs
Measured feasibility signals are translated into concrete implications for scope, sequencing, and resourcing. Architectural limits, integration dependencies, and compute trade-offs are made explicit, enabling roadmap decisions that reflect real system behavior rather than optimistic timelines or incomplete assumptions.
Defining go / no-go criteria for ML features
PerformaCode establishes viability thresholds that combine model performance with system-level realities such as streaming behavior, input variability, and downstream integration. This creates a clear basis for pushing back on ML features that are technically interesting but operationally unsound, before they become sunk cost.
Technologies
- Python
- PyTorch
- OpenCV
- NumPy
- CUDA
- Linux
- Git
- Jupyter