DEV Community

kaz

Introducing kotoba v0.0.1: Natural Language Web Testing with 6x Speed Improvement

kotoba - Natural Language Web Testing

Introduction

On June 20th, 2025, we released kotoba v0.0.1, a natural language web testing tool. This article describes how a staged fallback strategy and 203 pattern matching rules in our assertion system cut average processing time per instruction from 300ms to 50ms, a 6x speed improvement.

What is kotoba?

kotoba is a Python tool that enables web testing through natural language instructions. By combining Playwright with LLMs, it automates browser interactions using intuitive commands like:

Click the "Login" button
Enter "[email protected]" in the email field
Enter "password123" in the password field
Click the "Submit" button
Verify that "Login successful" message is displayed

The Challenge: LLM Processing Bottleneck

The biggest challenge in natural language testing tools is processing speed. When every instruction is processed through an LLM:

  • Average processing time: 1.1-1.6 seconds per instruction
  • Cause: LLM inference processing required
  • Impact: Massive execution time for large test suites
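
To put those per-instruction figures in perspective, here is a quick back-of-the-envelope calculation. The suite size of 1,000 instructions is a hypothetical number chosen for illustration, not a measurement from kotoba; only the 1.1-1.6 second range comes from the figures above.

```python
# Rough suite-time estimate when every instruction goes through the LLM.
# SUITE_SIZE is a hypothetical figure; the per-instruction range is the
# 1.1-1.6 s quoted in the article.
SUITE_SIZE = 1000
PER_INSTRUCTION_S = (1.1, 1.6)


def suite_minutes(instructions: int, seconds_each: float) -> float:
    """Total wall-clock minutes for a sequential run."""
    return instructions * seconds_each / 60


low, high = (suite_minutes(SUITE_SIZE, s) for s in PER_INSTRUCTION_S)
print(f"{SUITE_SIZE} instructions: {low:.1f} to {high:.1f} minutes")
```

Under these assumptions a sequential run spends roughly 18 to 27 minutes on instruction processing alone, before any browser work.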

To solve this challenge, we adopted a strategy of pre-defining frequent patterns to minimize LLM dependency.

Our Solution: Staged Fallback Strategy

1. Architecture Design

We implemented a two-stage processing flow in kotoba:

Natural Language Instruction
    ↓
【Stage 1】Assertion Pattern Matching (< 1ms)
    ├─ (match found) → ✅ Execute Assertion
    ↓ (no match)
【Stage 2】LLM-based General Action Processing (100-1000ms)
    ↓
🎯 Execute Browser Action
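
The two-stage flow can be sketched in a few lines. This is a minimal illustration and not kotoba's actual code: the one-entry `PATTERNS` table, the `match_assertion` and `dispatch` functions, and the `fake_llm` stand-in are all hypothetical names introduced here.

```python
import re
from typing import Any, Callable, Dict, Optional

# Hypothetical one-entry rule table; the article's real table has 203 rules.
PATTERNS = [
    (r"(?:verify|check)\s+that\s+['\"](.+?)['\"]\s+is\s+displayed", "text_exists"),
]


def match_assertion(instruction: str) -> Optional[Dict[str, Any]]:
    """Stage 1: cheap regex lookup. Returns None when no rule matches."""
    for pattern, assertion_type in PATTERNS:
        m = re.search(pattern, instruction, re.IGNORECASE)
        if m:
            return {"type": assertion_type, "expected": m.group(1)}
    return None


def dispatch(instruction: str,
             llm_fallback: Callable[[str], Dict[str, Any]]) -> Dict[str, Any]:
    """Try Stage 1 first; call the slower LLM stage only on a miss."""
    assertion = match_assertion(instruction)
    if assertion is not None:
        return {"stage": 1, **assertion}
    return {"stage": 2, **llm_fallback(instruction)}


def fake_llm(instruction: str) -> Dict[str, Any]:
    """Stand-in for the LLM call so the sketch runs offline."""
    return {"type": "action", "raw": instruction}


print(dispatch("Verify that 'Login successful' is displayed", fake_llm))
print(dispatch('Click the "Login" button', fake_llm))
```

The point of the design is that the expensive call sits behind the cheap one: an assertion that matches a rule never reaches the LLM at all.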

2. 25 Assertion Types

The assertion system defines 25 assertion types, grouped by what they check (text, elements, URL and title, forms, and others):

from enum import Enum

class AssertionType(Enum):
    # Text-related assertions
    TEXT_EXISTS = "text_exists"
    TEXT_NOT_EXISTS = "text_not_exists"
    TEXT_EQUALS = "text_equals"
    TEXT_CONTAINS = "text_contains"

    # Element-related assertions
    ELEMENT_EXISTS = "element_exists"
    ELEMENT_VISIBLE = "element_visible"
    ELEMENT_ENABLED = "element_enabled"

    # URL/Title assertions
    URL_CONTAINS = "url_contains"
    TITLE_CONTAINS = "title_contains"

    # Form assertions
    INPUT_VALUE_EQUALS = "input_value_equals"
    CHECKBOX_CHECKED = "checkbox_checked"
    # ... remaining types omitted (25 in total)

3. 203 Pattern Matching Rules

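To handle the variety of ways people phrase the same check, we implemented 203 patterns across 23 categories. Each rule is a tuple of a regular expression, the assertion type it maps to, and the kind of parameter its capture group holds: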

Basic Japanese Patterns

# Text existence verification
(r"[「“”]?(.+?)[」“”]?が(?:表示されて|出て|見えて)?(?:いる|いること|いることを)(?:確認|チェック|検証)",
 AssertionType.TEXT_EXISTS, "text"),

# Polite form support
(r"[「“”]?(.+?)[」“”]?が(?:表示されて|出て|見えて)?(?:いる|いること|いることを)(?:確認|チェック|検証)(?:します|してください|お願いします)",
 AssertionType.TEXT_EXISTS, "text"),
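
As a quick check of how the basic text-existence rule behaves, the sketch below runs it against two sample instructions. It assumes the quote characters inside the character classes are curly quotes (“ and ”); the `extract_expected` helper and the sample instructions are hypothetical and only there for illustration.

```python
import re

# Basic text-existence rule, with curly quotes assumed in the
# character classes around the captured text.
TEXT_EXISTS_JA = re.compile(
    r"[「“”]?(.+?)[」“”]?が(?:表示されて|出て|見えて)?"
    r"(?:いる|いること|いることを)(?:確認|チェック|検証)"
)


def extract_expected(instruction: str):
    """Return the captured text, or None if the instruction does not match."""
    m = TEXT_EXISTS_JA.search(instruction)
    return m.group(1) if m else None


# "Confirm that 「ログイン成功」 (login successful) is displayed"
print(extract_expected("「ログイン成功」が表示されていることを確認"))

# The same request in polite form ("... please confirm")
print(extract_expected("「ログイン成功」が表示されていることを確認してください"))

# An action instruction ("click the login button") should not match.
print(extract_expected("ログインボタンをクリック"))
```

Because `re.search` is unanchored, the basic rule in this sketch also matches the polite-form sentence, since the trailing してください is simply left unconsumed.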

Colloquial and Question Forms

# Kansai dialect
(r"[「“”]?(.+?)[」“”]?(?:が|って)(?:出てる|見える)(?:で|やん|やんか|わ|な)(?:？|\?)?",
 AssertionType.TEXT_EXISTS, "text"),

# Question forms
(r"[「“”]?(.+?)[」“”]?(?:は|が)(?:表示されて|見えて)(?:いますか|いるでしょうか|いるか)(?:？|\?)?",
 AssertionType.TEXT_EXISTS, "text"),

English and Chinese Patterns

# English patterns
(r"(?:verify|check|confirm|assert)(?:\s+that)?\s+[\"'](.+?)[\"']\s+(?:is|exists?|appears?|is\s+(?:visible|displayed|present))",
 AssertionType.TEXT_EXISTS, "text"),

# Chinese patterns
(r"(?:验证|检查|确认|断言)[\"'“”](.+?)[\"'“”](?:存在|显示|出现)",
 AssertionType.TEXT_EXISTS, "text"),
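
One detail worth noting about the English rule is that its verbs are written in lowercase. The article does not say whether kotoba lowercases instructions first or passes `re.IGNORECASE`; the sketch below assumes the flag purely for illustration and shows what happens with and without it.

```python
import re

# English text-existence rule as listed in the article.
ENGLISH_TEXT_EXISTS = (
    r"(?:verify|check|confirm|assert)(?:\s+that)?\s+[\"'](.+?)[\"']\s+"
    r"(?:is|exists?|appears?|is\s+(?:visible|displayed|present))"
)

instruction = "Verify that 'Example Domain' is displayed"

# Case-sensitive search: the capital "V" prevents a match.
case_sensitive = re.search(ENGLISH_TEXT_EXISTS, instruction)
print(case_sensitive)

# With re.IGNORECASE the same rule captures the quoted text.
case_insensitive = re.search(ENGLISH_TEXT_EXISTS, instruction, re.IGNORECASE)
print(case_insensitive.group(1))
```

Instructions typed at the start of a sentence are usually capitalized, so some form of case normalization is needed for rules like this one to fire.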

Domain-Specific Patterns

# IT/Web industry
(r"[「“”]?(.+?)[」“”]?(?:が|の)(?:レンダリング|描画|出力)(?:が|は)(?:正常|適切|問題なし)(?:か|かどうか)(?:確認|検証|チェック)",
 AssertionType.TEXT_EXISTS, "text"),

# Form elements
(r"[「“”]?(.+?)[」“”]?(?:ボタン|button|Button|按钮|按鈕)(?:が|は)(?:表示|存在|クリック可能|押せる|有効)(?:になって|で|に)(?:いる|いること)(?:を|について)?(?:確認|検証|チェック)",
 AssertionType.ELEMENT_EXISTS, "button"),

Technical Implementation Details

Pattern Matching Engine

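import re
from typing import Any, Dict, Optional

class AssertionPatternMatcher:
    # cls.PATTERNS holds the (regex, assertion_type, param_type) tuples shown above.

    @classmethod
    def parse(cls, instruction: str) -> Optional[Dict[str, Any]]:
        # Rules are tried in order; the first match wins.
        for pattern, assertion_type, param_type in cls.PATTERNS:
            match = re.search(pattern, instruction)
            if match:
                # Flexible quote handling: trim whitespace, then any
                # surrounding Japanese, curly, or straight quotes
                text = match.group(1).strip()
                text = text.strip('「」“”‘’\'"')
                return {
                    "type": assertion_type,
                    "expected": text,
                    "selector": None
                }
        return None  # No rule matched: fall back to LLM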
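
One way to keep Stage 1 cheap as the rule table grows is to compile every regex once at import time. The sketch below is an assumption about how that could look, not a description of kotoba's internals; `RAW_PATTERNS`, `COMPILED`, and this standalone `parse` function are hypothetical, and the two rules are simplified stand-ins.

```python
import re
import timeit
from typing import Any, Dict, Optional

# Hypothetical two-entry table standing in for the full rule list.
RAW_PATTERNS = [
    (r"(?:verify|check|confirm|assert)(?:\s+that)?\s+[\"'](.+?)[\"']\s+is\s+displayed",
     "text_exists"),
    (r"(?:verify|check)\s+that\s+url\s+contains\s+(\S+)", "url_contains"),
]

# Compile once at import time instead of on every parse() call.
COMPILED = [(re.compile(p, re.IGNORECASE), t) for p, t in RAW_PATTERNS]


def parse(instruction: str) -> Optional[Dict[str, Any]]:
    """Return the first matching rule's assertion, or None on a miss."""
    for pattern, assertion_type in COMPILED:
        m = pattern.search(instruction)
        if m:
            return {"type": assertion_type,
                    "expected": m.group(1).strip(),
                    "selector": None}
    return None


print(parse("Check that URL contains example.com"))

# Rough per-call cost on this machine; the figure varies by hardware.
runs = 10_000
seconds = timeit.timeit(lambda: parse("Check that URL contains example.com"), number=runs)
print(f"{seconds / runs * 1000:.4f} ms per parse() call")
```

Python's `re` module already caches recently compiled patterns internally, so the gain from precompiling may be modest; the main benefit is that a malformed rule fails at import time rather than in the middle of a test run.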

Assertion Execution Engine

import time

class AssertionExecutor:
    async def execute(self, assertion: Assertion) -> AssertionResult:
        start_time = time.time()

        try:
            if assertion.type == AssertionType.TEXT_EXISTS:
                # Multiple selector strategy for robustness
                selectors = [
                    f"text={assertion.expected}",
                    f"text=*{assertion.expected}*",
                    f":has-text('{assertion.expected}')"
                ]

                for selector in selectors:
                    elements = await self.page.query_selector_all(selector)
                    if elements:
                        # Remaining result fields are omitted in this excerpt
                        return AssertionResult(passed=True)

                # No selector matched: report a failure instead of returning None
                return AssertionResult(
                    passed=False,
                    error_message=f"Text not found: {assertion.expected}",
                )

            # Other assertion types are omitted in this excerpt

        except Exception as e:
            return AssertionResult(passed=False, error_message=str(e))
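
The selector fallback loop can be exercised without a browser by swapping in a stub page object. Everything in this sketch is hypothetical: `FakePage` only mimics the one page method used above, the `AssertionResult` fields are modelled on the fields that appear in this article, and `text_exists` is a simplified free function rather than kotoba's executor.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AssertionResult:  # hypothetical shape, based on fields shown in the article
    passed: bool
    error_message: Optional[str] = None
    execution_time_ms: float = 0.0


class FakePage:
    """Stub standing in for a browser page; it only knows a fixed set of selectors."""

    def __init__(self, known_selectors: List[str]):
        self.known = set(known_selectors)

    async def query_selector_all(self, selector: str) -> List[str]:
        return ["<element>"] if selector in self.known else []


async def text_exists(page, expected: str) -> AssertionResult:
    """Try each selector in turn; fail explicitly if none finds an element."""
    start = time.perf_counter()
    selectors = [f"text={expected}", f":has-text('{expected}')"]
    for selector in selectors:
        if await page.query_selector_all(selector):
            elapsed = (time.perf_counter() - start) * 1000
            return AssertionResult(passed=True, execution_time_ms=elapsed)
    elapsed = (time.perf_counter() - start) * 1000
    return AssertionResult(False, f"Text not found: {expected}", elapsed)


page = FakePage(["text=Example Domain"])
found = asyncio.run(text_exists(page, "Example Domain"))
missing = asyncio.run(text_exists(page, "Missing text"))
print(found.passed, missing.passed, missing.error_message)
```

Returning an explicit failed result on a miss keeps the caller from having to distinguish between "assertion failed" and "executor returned nothing".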

Performance Improvement Results

Dramatic Processing Time Reduction

Pattern Count    LLM Usage    Avg Processing Time    Speed Improvement
54 (initial)     30%          300ms                  Baseline
130 (interim)    10%          100ms                  3x faster
203 (current)    5%           50ms                   6x faster
500 (target)     1%           10ms                   30x faster
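
The figures in the table are consistent with a simple two-path model in which the average time is dominated by the share of instructions that still reach the LLM. The sketch below is an illustration under stated assumptions, not a measurement: `LLM_MS` is taken as 1000ms (the upper end of the Stage 2 range quoted earlier) and `PATTERN_MS` as 1ms (the Stage 1 upper bound).

```python
# Two-path latency model. LLM_MS and PATTERN_MS are assumptions chosen to
# illustrate the table, not measured values.
LLM_MS = 1000.0    # upper end of the Stage 2 range quoted in the article
PATTERN_MS = 1.0   # upper bound quoted for Stage 1


def expected_avg_ms(llm_fraction: float) -> float:
    """Average time per instruction when llm_fraction of them fall back to the LLM."""
    return llm_fraction * LLM_MS + (1.0 - llm_fraction) * PATTERN_MS


for patterns, fraction in [(54, 0.30), (130, 0.10), (203, 0.05), (500, 0.01)]:
    print(f"{patterns:>3} patterns: about {expected_avg_ms(fraction):.0f} ms")
```

Under these assumptions the model gives roughly 301, 101, 51, and 11 ms, close to the table's 300, 100, 50, and 10 ms. It also suggests why each additional gain gets harder: the average only falls as fast as the LLM fallback rate does.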

Test Success Rate Enhancement

  • Success Rate: 100% (6/6 test cases)
  • Error Handling: Robust fallback mechanisms
  • Internationalization: Japanese, English, Chinese support

Real-World Usage Examples

YAML Test Cases

name: "Assertion Function Test"
base_url: "https://example.com"

test_cases:
  - name: "Basic Text Verification"
    steps:
      - instruction: "Verify that 'Example Domain' is displayed"
      - instruction: "Check that URL contains example.com"

  - name: "Colloquial Expression Test"
    steps:
      - instruction: "Example Domainって表示されてる?"
      - instruction: "Can you see Example Domain?"

Execution Results

{
  "assertion_result": {
    "type": "text_exists",
    "passed": true,
    "expected": "Text 'Example Domain' exists 1 or more times",
    "actual": 1,
    "execution_time_ms": 6.17
  }
}

Pattern Categories (23 categories, 203 patterns)

Our comprehensive pattern coverage includes:

  1. Form Element Patterns - Buttons, links, input fields, select boxes
  2. Status & State Patterns - Loading, errors, success, warnings
  3. Media Patterns - Images, videos, icons
  4. Table & List Patterns - Table data, list items, counts
  5. Modal & Popup Patterns - Modals, dialogs, alerts, notifications
  6. Navigation Patterns - Menus, tabs, navigation
  7. Accessibility Patterns - ARIA, focus, screen readers
  8. Responsive & Device Patterns - Mobile, responsive design
  9. Performance Patterns - Speed, response time
  10. Security Patterns - HTTPS, SSL, secure connections
  11. Special Character Patterns - Symbols, required marks
  12. Date & Time Patterns - Dates, times, current time
  13. Price & Amount Patterns - Prices, totals, currency
  14. Count & Number Patterns - Counts, remaining items
  15. Login & Auth Patterns - Login status, user info
  16. Download & Upload Patterns - File operations
  17. Progress Patterns - Progress, progress bars
  18. Message & Text Patterns - Info, hints
  19. Validation Patterns - Validation errors, validity
  20. Sort & Filter Patterns - Ascending, descending, order
  21. Search & Filter Patterns - Filters, search results
  22. Pagination Patterns - Page numbers, next/previous
  23. Language & i18n Patterns - Language switching, localization

Future Roadmap

Phase 2: Machine Learning-Assisted Pattern Generation

  • Automatic pattern extraction from log data
  • Dynamic optimization based on usage frequency

Phase 3: Ultimate Speed Optimization

  • Implementation of 500+ patterns
  • Achieving sub-millisecond processing times
  • Community-driven pattern database construction

Conclusion

Our assertion system implementation in kotoba achieved:

  1. 6x Speed Improvement: 300ms → 50ms
  2. 203 Patterns: Comprehensive natural language support
  3. 100% Test Success Rate: 6 of 6 test cases passed
  4. Multilingual Support: Japanese, English, Chinese

This work shows one way to bring natural language processing and web test automation together. By putting pattern matching in front of an LLM, kotoba keeps the ease of natural language instructions while avoiding LLM latency for the most common assertions.

kotoba v0.0.1 was released on June 20th, 2025, and is available as open source. We will keep improving it, with the goal of making it the fastest natural language testing tool available.


GitHub: kotoba

Release Date: June 20th, 2025 (v0.0.1)

Tech Stack: Python, Playwright, LLM, Regex

Tags: #testing #automation #nlp
