Pure regex — no Tree-sitter, no native binaries

21 languages.
Zero install.

Every extractor is a single JS file. No grammar files to download. Runs deterministically on any machine with Node.js 18+.

21 Languages
25 Max sigs/file
0 npm packages
177 Tests passing

How extraction works

Input → Signatures

Full source code goes in. Only public shapes come out. Bodies, comments, imports, and private members are stripped entirely.

🟦 TypeScript .ts .tsx
Source file (1,240 tokens)
export interface User {
  id: string;
  email: string;
  createdAt: Date;
}

export class UserService {
  private db: Database;

  async findById(id: string):
    Promise<User> {
    // implementation...
  }

  async create(dto: CreateDto):
    Promise<User> { ... }

  private _validate(d: any) { ... }
}
Signatures (62 tokens)
export interface User
export class UserService
async findById(id: string): Promise<User>
async create(dto: CreateDto): Promise<User>
 
# 95% token reduction
# private _validate stripped
# bodies stripped
# comments stripped
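
To make the input-to-signatures step concrete, here is a minimal sketch of how a regex-only pass over the TypeScript example above could work. It is not the project's actual extractor: the names DECLARATION, METHOD, and extractSignatures are hypothetical, and the patterns are heuristics that ignore nested parentheses in parameter lists, object-literal return types, and comment markers inside string literals.

```javascript
// Hypothetical sketch, not the project's actual TypeScript extractor.

// Exported top-level shapes: `export interface X`, `export class X`, `export enum X`.
const DECLARATION =
  /^export\s+(?:abstract\s+)?(?:interface|class|enum)\s+[A-Za-z_$][\w$]*/gm;

// Indented methods with an explicit return type. The name must start with a
// letter, so `_name()` and `#name()` never match, and a leading `private`
// keyword fails because `(` does not directly follow it.
const METHOD =
  /^[ \t]+(?:public\s+)?((?:async\s+)?[A-Za-z][\w$]*\s*\([^)]*\)\s*:\s*[^{;]+?)\s*\{/gm;

function extractSignatures(source) {
  // Drop block comments and line comments before matching.
  const code = source
    .replace(/\/\*[\s\S]*?\*\//g, "")
    .replace(/\/\/.*$/gm, "");

  const found = [];
  for (const m of code.matchAll(DECLARATION)) {
    found.push({ index: m.index, text: m[0] });
  }
  for (const m of code.matchAll(METHOD)) {
    found.push({ index: m.index, text: m[1] });
  }

  // Restore source order, then collapse line breaks inside a signature.
  return found
    .sort((a, b) => a.index - b.index)
    .map((s) => s.text.replace(/\s+/g, " ").trim());
}

const sample = `
export interface User {
  id: string;
  email: string;
  createdAt: Date;
}

export class UserService {
  private db: Database;

  async findById(id: string):
    Promise<User> {
    // implementation...
    return this.db.get(id);
  }

  async create(dto: CreateDto):
    Promise<User> { return this.db.insert(dto); }

  private _validate(d: any) { return true; }
}
`;

console.log(extractSignatures(sample).join("\n"));
```

Because the sketch uses only string replacement and regex matching, the same input always yields the same list, which is the property the page describes as deterministic.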

Extraction rules

What's in, what's out

The same rules apply across all 21 languages. Extraction is deterministic: the same input always produces the same output.

✓ Extracted
  • Exported / public classes with public methods
  • Exported / public functions and procedures
  • Exported types, interfaces, and enums
  • Internal classes (unexported, lower priority)
  • Internal functions (unexported, lower priority)
  • Method signatures — name, parameters, return type
  • Generic type parameters and constraints
  • Async / await marker where language supports it
✗ Never extracted
  • Function and method bodies (everything inside {} or an indented block)
  • Comments — //, /*, #, """, ''' in all languages
  • Import and require statements
  • Variable declarations that aren't type definitions
  • Private class members (_prefix, # prefix, private keyword)
  • Test files (*.test.*, *.spec.*, *_test.*)
  • Generated files (*.pb.*, *.generated.*)
  • Any credential, key, token, or secret pattern
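
A few of the rules above can be sketched as small predicates. Everything here is an assumption for illustration: the helper names (isSkippedFile, isPrivateMember, capSignatures), the { text, exported } signature shape, and the reading of "lower priority" as "kept only if room remains under the 25-signature cap" are not taken from the project's source.

```javascript
// Hypothetical helpers; names and shapes are illustrative assumptions.

// Test and generated files: *.test.*, *.spec.*, *_test.*, *.pb.*, *.generated.*
function isSkippedFile(filePath) {
  const name = filePath.split(/[\\/]/).pop();
  return /\.(?:test|spec|pb|generated)\./.test(name) || /_test\./.test(name);
}

// Private members: `private` keyword, `_` prefix, or `#` prefix.
function isPrivateMember(line) {
  return /^\s*(?:private\b|_|#)/.test(line);
}

// Assumption: internal (unexported) signatures are kept only when space
// remains after all exported ones, up to the per-file maximum.
function capSignatures(signatures, max = 25) {
  const exported = signatures.filter((s) => s.exported);
  const internal = signatures.filter((s) => !s.exported);
  return [...exported, ...internal].slice(0, max);
}

console.log(isSkippedFile("src/user.test.ts")); // true
console.log(isSkippedFile("src/user.ts")); // false
console.log(isPrivateMember("  private db: Database;")); // true
```

Filtering by file name before reading any content keeps test and generated files out of the output without parsing them.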

Reference

All 21 languages

One extractor file per language. Add a language by contributing a file to src/extractors/.

🟦
TypeScript
.ts .tsx
Extracts
export function export class interface type alias enum methods generics
🟨
JavaScript
.js .jsx .mjs .cjs
Extracts
export function export class arrow functions module.exports methods
🐍
Python
.py .pyw
Extracts
def functions class methods async def @dataclass @property
Java
.java
Extracts
public class interface enum public methods annotations
🎯
Kotlin
.kt .kts
Extracts
fun class data class interface object sealed class
🐹
Go
.go
Extracts
func type struct interface method receivers type alias
🦀
Rust
.rs
Extracts
pub fn pub struct trait enum impl methods pub type
🔷
C#
.cs
Extracts
public class interface enum public methods record struct
⚙️
C / C++
.c .cpp .h .hpp .cc
Extracts
functions class / struct public methods template typedef
💎
Ruby
.rb .rake
Extracts
def class module attr_accessor include / extend
🐘
PHP
.php
Extracts
function class interface trait public methods
🍎
Swift
.swift
Extracts
func class / struct protocol enum extension
🎯
Dart
.dart
Extracts
class void / return type functions abstract class mixin methods
🎭
Scala
.scala .sc
Extracts
def class object trait case class methods
💚
Vue
.vue
Extracts
defineProps defineEmits composables component name script functions
🔶
Svelte
.svelte
Extracts
export let props export function script functions component name
🌐
HTML
.html .htm
Extracts
page title h1-h3 headings form id/action script src link rel
🎨
CSS / SCSS / LESS
.css .scss .sass .less
Extracts
CSS variables (--) @mixin @function media queries top-level selectors
📋
YAML
.yml .yaml
Extracts
top-level keys CI job names K8s kind/name second-level keys
🐚
Shell
.sh .bash .zsh .fish
Extracts
function names exported vars script description
🐳
Dockerfile
Dockerfile Dockerfile.*
Extracts
FROM image EXPOSE ports ENTRYPOINT multi-stage names ARG / ENV keys
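
With one extractor per language, something has to route each file to the right one. The sketch below shows one plausible way to do that from the extensions listed above. The EXTENSIONS table covers only a subset of the 21 languages, and the languageFor function and its language keys are hypothetical, not the project's real module layout.

```javascript
// Hypothetical routing table; a subset of the extensions listed above.
const EXTENSIONS = {
  typescript: [".ts", ".tsx"],
  javascript: [".js", ".jsx", ".mjs", ".cjs"],
  python: [".py", ".pyw"],
  go: [".go"],
  rust: [".rs"],
  yaml: [".yml", ".yaml"],
  shell: [".sh", ".bash", ".zsh", ".fish"],
};

// Invert the table once so each lookup is a single Map access.
const BY_EXTENSION = new Map();
for (const [language, extensions] of Object.entries(EXTENSIONS)) {
  for (const ext of extensions) BY_EXTENSION.set(ext, language);
}

function languageFor(filePath) {
  const name = filePath.split(/[\\/]/).pop();

  // Dockerfiles are matched by name: `Dockerfile` or `Dockerfile.*`.
  if (/^Dockerfile(\..+)?$/.test(name)) return "dockerfile";

  const dot = name.lastIndexOf(".");
  if (dot <= 0) return null; // no extension, or a dotfile such as `.bashrc`
  return BY_EXTENSION.get(name.slice(dot).toLowerCase()) ?? null;
}

console.log(languageFor("src/app.tsx")); // typescript
console.log(languageFor("docker/Dockerfile.prod")); // dockerfile
console.log(languageFor("README.md")); // null
```

Returning null for unknown files lets the caller skip them silently instead of guessing at a language.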

Want to add a language?

Contributing is simple

One file per language. Follow the extractor contract. All tests must pass. See CONTRIBUTING.md.
