DiagnosticIQ: A Benchmark for LLM-Based Industrial Maintenance Action Recommendation from Symbolic Rules Paper • 2605.08614 • Published 16 days ago • 7
Code-Guided Reasoning for Small Language Models: Evaluating Executable MCQA Scaffolds Paper • 2605.18827 • Published 13 days ago • 6
Evaluating Temporal Semantic Caching and Workflow Optimization in Agentic Plan-Execute Pipelines Paper • 2605.20630 • Published 5 days ago • 12
SPIRAL: Symbolic LLM Planning via Grounded and Reflective Search Paper • 2512.23167 • Published Dec 29, 2025 • 1
IndustryAssetEQA: A Neurosymbolic Operational Intelligence System for Embodied Question Answering in Industrial Asset Maintenance Paper • 2604.23446 • Published about 1 month ago • 4
MCP-Cosmos: World Model-Augmented Agents for Complex Task Execution in MCP Environments Paper • 2605.09131 • Published 16 days ago • 55
Results and Retrospective Analysis of the CODS 2025 AssetOpsBench Challenge Paper • 2605.08518 • Published 17 days ago • 10
SPIN: Structural LLM Planning via Iterative Navigation for Industrial Tasks Paper • 2605.14051 • Published 12 days ago • 1
rohithkanathur/assetopsbench-transformer-scenarios-dataset Viewer • Updated 16 days ago • 50 • 87 • 1
Enterprise Agents and Benchmarks Collection Enterprise agent ecosystem featuring AssetOpsBench (industrial) and ITBench (SRE, FinOps, CISO), CUGA to accelerate AI Automation • 18 items • Updated 5 days ago • 17