Smithery Logo
MCPsSkillsDocsPricing
Login
Smithery Logo

Give agents more agency

Resources

DocumentationPrivacy PolicySystem Status

Company

PricingAboutBlog

Connect

© 2026 Smithery. All rights reserved.

    datadrivenconstruction

    open-construction-estimate

    datadrivenconstruction/open-construction-estimate
    Planning
    7
    1 installs

    About

    SKILL.md

    Install

    • Claude Code
      Claude Code
    • Codex
      Codex
    • OpenClaw
      OpenClaw
    • Cursor
      Cursor
    • Amp
      Amp
    • GitHub Copilot
      GitHub Copilot
    • Gemini CLI
      Gemini CLI
    • Kilo Code
      Kilo Code
    • Junie
      Junie
    • Replit
      Replit
    • Windsurf
      Windsurf
    • Cline
      Cline
    • Continue
      Continue
    • OpenCode
      OpenCode
    • OpenHands
      OpenHands
    • Roo Code
      Roo Code
    • Augment
      Augment
    • Goose
      Goose
    • Trae
      Trae
    • Zencoder
      Zencoder
    • Antigravity
      Antigravity
    • Download skill
    ├─
    ├─
    └─

    About

    Access and utilize open construction pricing databases. Match BIM elements to standardized work items, calculate costs using public unit price databases with 55,000+ work items.

    SKILL.md

    Open Construction Estimate

    Overview

    This skill leverages open construction pricing databases for automated cost estimation. Match project elements to standardized work items and calculate costs using publicly available unit prices.

    Data Sources:

    • OpenConstructionEstimate (55,000+ work items)
    • RSMeans Online (subscription)
    • Government pricing databases
    • Regional cost indexes

    "Открытые базы данных расценок содержат более 55,000 позиций работ, что позволяет автоматизировать сметные расчеты для большинства проектов." — DDC LinkedIn

    Quick Start

    import pandas as pd
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity
    
    # Load work items database
    work_items = pd.read_csv("open_construction_estimate.csv")
    print(f"Loaded {len(work_items)} work items")
    
    # Simple matching function
    vectorizer = TfidfVectorizer(ngram_range=(1, 2))
    item_vectors = vectorizer.fit_transform(work_items['description'])
    
    def find_matching_items(query, top_n=5):
        query_vec = vectorizer.transform([query])
        similarities = cosine_similarity(query_vec, item_vectors)[0]
        top_indices = similarities.argsort()[-top_n:][::-1]
    
        return work_items.iloc[top_indices][['code', 'description', 'unit', 'unit_price']]
    
    # Find matches
    matches = find_matching_items("reinforced concrete wall 300mm")
    print(matches)
    

    Open Database Structure

    Database Schema

    # Standard work items database structure
    WORK_ITEMS_SCHEMA = {
        'code': 'Work item code (e.g., 03.31.13.13)',
        'description': 'Full description of work',
        'short_description': 'Abbreviated description',
        'unit': 'Unit of measure (m³, m², ton, pcs)',
        'unit_price': 'Base unit price',
        'labor_cost': 'Labor component per unit',
        'material_cost': 'Material component per unit',
        'equipment_cost': 'Equipment component per unit',
        'labor_hours': 'Labor hours per unit',
        'crew_size': 'Typical crew size',
        'productivity': 'Units per day',
        'category_l1': 'Primary category (CSI Division)',
        'category_l2': 'Secondary category',
        'category_l3': 'Detailed category',
        'region': 'Geographic region',
        'year': 'Price year',
        'source': 'Data source'
    }
    
    # CSI MasterFormat Divisions
    CSI_DIVISIONS = {
        '03': 'Concrete',
        '04': 'Masonry',
        '05': 'Metals',
        '06': 'Wood, Plastics, Composites',
        '07': 'Thermal and Moisture Protection',
        '08': 'Openings',
        '09': 'Finishes',
        '10': 'Specialties',
        '21': 'Fire Suppression',
        '22': 'Plumbing',
        '23': 'HVAC',
        '26': 'Electrical',
        '31': 'Earthwork',
        '32': 'Exterior Improvements',
        '33': 'Utilities'
    }
    

    Work Item Matching Engine

    Semantic Matching System

    import pandas as pd
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity
    from sentence_transformers import SentenceTransformer
    from typing import List, Dict, Optional, Tuple
    import re
    
    class WorkItemMatcher:
        """Match BIM elements to standardized work items"""
    
        def __init__(self, database_path: str, use_embeddings: bool = True):
            self.db = pd.read_csv(database_path)
    
            # TF-IDF for fast initial filtering
            self.tfidf = TfidfVectorizer(
                ngram_range=(1, 3),
                max_features=10000,
                stop_words='english'
            )
            self.tfidf_matrix = self.tfidf.fit_transform(self.db['description'])
    
            # Sentence embeddings for semantic matching
            self.use_embeddings = use_embeddings
            if use_embeddings:
                self.embedder = SentenceTransformer('all-MiniLM-L6-v2')
                self.embeddings = self.embedder.encode(
                    self.db['description'].tolist(),
                    show_progress_bar=True
                )
    
        def match(self, query: str, top_n: int = 5,
                  category: str = None) -> List[Dict]:
            """Find matching work items for a query"""
            # Filter by category if specified
            if category:
                mask = self.db['category_l1'].str.contains(category, case=False, na=False)
                search_db = self.db[mask]
                search_matrix = self.tfidf_matrix[mask]
            else:
                search_db = self.db
                search_matrix = self.tfidf_matrix
    
            if self.use_embeddings:
                return self._semantic_match(query, search_db, top_n)
            else:
                return self._tfidf_match(query, search_db, search_matrix, top_n)
    
        def _tfidf_match(self, query: str, db: pd.DataFrame,
                         matrix, top_n: int) -> List[Dict]:
            """TF-IDF based matching"""
            query_vec = self.tfidf.transform([query])
            similarities = cosine_similarity(query_vec, matrix)[0]
    
            top_indices = similarities.argsort()[-top_n:][::-1]
    
            results = []
            for idx in top_indices:
                row = db.iloc[idx]
                results.append({
                    'code': row['code'],
                    'description': row['description'],
                    'unit': row['unit'],
                    'unit_price': row['unit_price'],
                    'similarity': float(similarities[idx]),
                    'category': row.get('category_l1', '')
                })
    
            return results
    
        def _semantic_match(self, query: str, db: pd.DataFrame,
                            top_n: int) -> List[Dict]:
            """Semantic embedding based matching"""
            query_embedding = self.embedder.encode([query])
    
            # Get indices for filtered db
            indices = db.index.tolist()
            filtered_embeddings = self.embeddings[indices]
    
            similarities = cosine_similarity(query_embedding, filtered_embeddings)[0]
            top_indices = similarities.argsort()[-top_n:][::-1]
    
            results = []
            for i, idx in enumerate(top_indices):
                row = db.iloc[idx]
                results.append({
                    'code': row['code'],
                    'description': row['description'],
                    'unit': row['unit'],
                    'unit_price': row['unit_price'],
                    'similarity': float(similarities[idx]),
                    'category': row.get('category_l1', '')
                })
    
            return results
    
        def match_bim_element(self, element: Dict) -> List[Dict]:
            """Match a BIM element to work items"""
            # Build query from element properties
            query_parts = []
    
            if element.get('material'):
                query_parts.append(element['material'])
            if element.get('category'):
                query_parts.append(element['category'])
            if element.get('description'):
                query_parts.append(element['description'])
    
            # Add dimensions if available
            if element.get('thickness'):
                query_parts.append(f"{element['thickness']}mm thick")
            if element.get('height'):
                query_parts.append(f"{element['height']}m high")
    
            query = ' '.join(query_parts)
    
            # Determine category from element type
            category = self._get_category_from_element(element)
    
            return self.match(query, top_n=3, category=category)
    
        def _get_category_from_element(self, element: Dict) -> Optional[str]:
            """Map BIM element type to CSI category"""
            element_mapping = {
                'IfcWall': 'Concrete|Masonry',
                'IfcSlab': 'Concrete',
                'IfcColumn': 'Concrete|Metals',
                'IfcBeam': 'Concrete|Metals',
                'IfcDoor': 'Openings',
                'IfcWindow': 'Openings',
                'IfcRoof': 'Thermal',
                'IfcStair': 'Concrete',
                'IfcPipeSegment': 'Plumbing',
                'IfcDuctSegment': 'HVAC'
            }
    
            elem_type = element.get('ifc_type', '')
            return element_mapping.get(elem_type)
    

    Cost Estimation Engine

    Automated Estimator

    class OpenConstructionEstimator:
        """Generate cost estimates using open databases"""
    
        def __init__(self, matcher: WorkItemMatcher, region: str = 'default'):
            self.matcher = matcher
            self.region = region
            self.regional_factors = self._load_regional_factors()
            self.estimates = []
    
        def _load_regional_factors(self) -> Dict[str, float]:
            """Load regional cost adjustment factors"""
            return {
                'default': 1.0,
                'northeast_us': 1.15,
                'southeast_us': 0.92,
                'midwest_us': 0.95,
                'west_us': 1.08,
                'moscow': 1.20,
                'spb': 1.10,
                'regions_ru': 0.85
            }
    
        def estimate_element(self, element: Dict) -> Dict:
            """Estimate cost for a single element"""
            # Get matching work items
            matches = self.matcher.match_bim_element(element)
    
            if not matches:
                return {
                    'element_id': element.get('id'),
                    'status': 'no_match',
                    'estimated_cost': 0
                }
    
            best_match = matches[0]
            quantity = element.get('quantity', 1)
            unit_price = best_match['unit_price']
    
            # Apply regional factor
            regional_factor = self.regional_factors.get(self.region, 1.0)
            adjusted_price = unit_price * regional_factor
    
            # Calculate total
            total_cost = adjusted_price * quantity
    
            estimate = {
                'element_id': element.get('id'),
                'element_type': element.get('ifc_type'),
                'element_description': element.get('description', ''),
                'matched_code': best_match['code'],
                'matched_description': best_match['description'],
                'match_confidence': best_match['similarity'],
                'unit': best_match['unit'],
                'quantity': quantity,
                'unit_price': unit_price,
                'regional_factor': regional_factor,
                'adjusted_unit_price': adjusted_price,
                'total_cost': total_cost
            }
    
            self.estimates.append(estimate)
            return estimate
    
        def estimate_project(self, elements: List[Dict]) -> Dict:
            """Estimate entire project"""
            for element in elements:
                self.estimate_element(element)
    
            df = pd.DataFrame(self.estimates)
    
            # Summary by category
            if not df.empty:
                summary = df.groupby('element_type').agg({
                    'total_cost': 'sum',
                    'element_id': 'count',
                    'match_confidence': 'mean'
                }).rename(columns={'element_id': 'count'})
            else:
                summary = pd.DataFrame()
    
            total = df['total_cost'].sum() if not df.empty else 0
    
            return {
                'total_cost': total,
                'element_count': len(elements),
                'matched_count': len(df[df['match_confidence'] > 0.5]) if not df.empty else 0,
                'summary_by_type': summary.to_dict() if not summary.empty else {},
                'details': self.estimates
            }
    
        def export_estimate(self, output_path: str) -> str:
            """Export estimate to Excel"""
            df = pd.DataFrame(self.estimates)
    
            with pd.ExcelWriter(output_path, engine='openpyxl') as writer:
                # Summary
                summary = pd.DataFrame({
                    'Metric': ['Total Cost', 'Elements', 'Matched', 'Avg Confidence'],
                    'Value': [
                        df['total_cost'].sum() if not df.empty else 0,
                        len(df),
                        len(df[df['match_confidence'] > 0.5]) if not df.empty else 0,
                        df['match_confidence'].mean() if not df.empty else 0
                    ]
                })
                summary.to_excel(writer, sheet_name='Summary', index=False)
    
                # Details
                if not df.empty:
                    df.to_excel(writer, sheet_name='Details', index=False)
    
                    # By type
                    by_type = df.groupby('element_type')['total_cost'].sum()
                    by_type.to_excel(writer, sheet_name='By_Type')
    
            return output_path
    
        def get_missing_items(self) -> List[Dict]:
            """Get elements that couldn't be matched"""
            df = pd.DataFrame(self.estimates)
            if df.empty:
                return []
    
            low_confidence = df[df['match_confidence'] < 0.5]
            return low_confidence.to_dict('records')
    

    Database Management

    Creating and Updating Database

    class OpenDatabaseManager:
        """Manage open construction pricing database"""
    
        def __init__(self, db_path: str):
            self.db_path = db_path
            self.db = self._load_or_create()
    
        def _load_or_create(self) -> pd.DataFrame:
            """Load existing or create new database"""
            try:
                return pd.read_csv(self.db_path)
            except FileNotFoundError:
                return pd.DataFrame(columns=list(WORK_ITEMS_SCHEMA.keys()))
    
        def add_items(self, items: List[Dict]):
            """Add new work items"""
            new_df = pd.DataFrame(items)
            self.db = pd.concat([self.db, new_df], ignore_index=True)
            self.db.drop_duplicates(subset=['code'], keep='last', inplace=True)
    
        def update_prices(self, updates: pd.DataFrame, year: int):
            """Update prices with new data"""
            for _, row in updates.iterrows():
                mask = self.db['code'] == row['code']
                if mask.any():
                    self.db.loc[mask, 'unit_price'] = row['unit_price']
                    self.db.loc[mask, 'year'] = year
    
        def apply_inflation(self, rate: float):
            """Apply inflation adjustment"""
            self.db['unit_price'] = self.db['unit_price'] * (1 + rate)
    
        def export_subset(self, category: str, output_path: str):
            """Export subset of database"""
            subset = self.db[
                self.db['category_l1'].str.contains(category, case=False, na=False)
            ]
            subset.to_csv(output_path, index=False)
    
        def save(self):
            """Save database"""
            self.db.to_csv(self.db_path, index=False)
    
        def get_statistics(self) -> Dict:
            """Get database statistics"""
            return {
                'total_items': len(self.db),
                'categories': self.db['category_l1'].nunique(),
                'avg_price': self.db['unit_price'].mean(),
                'price_range': (self.db['unit_price'].min(), self.db['unit_price'].max()),
                'latest_year': self.db['year'].max() if 'year' in self.db else None
            }
    

    Quick Reference

    Category CSI Division Typical Items
    Concrete 03 Walls, slabs, columns, beams
    Masonry 04 Brick, block, stone
    Metals 05 Structural steel, misc metals
    Finishes 09 Drywall, paint, flooring
    MEP 21-26 Plumbing, HVAC, electrical
    Sitework 31-33 Excavation, paving, utilities

    Resources

    • OpenConstructionEstimate: Open database initiative
    • CSI MasterFormat: https://www.csiresources.org/standards/masterformat
    • DDC Website: https://datadrivenconstruction.io

    Next Steps

    • See vector-search for semantic item matching
    • See cost-prediction for ML-based estimation
    • See qto-report for quantity extraction
    Recommended Servers
    The Local Intel
    The Local Intel
    PlanetScale
    PlanetScale
    LastLook Data
    LastLook Data
    Repository
    datadrivenconstruction/ddc_skills_for_ai_agents_in_construction
    Files