Getting Started

Comprehensive guide to NextRows API data extraction

This guide walks you through everything you need to know to start extracting data effectively with the NextRows API.

Understanding NextRows API

NextRows API is designed to solve the complex challenge of web data extraction by combining:

  • AI-powered understanding of web page structure and content
  • Natural language processing to interpret your extraction requirements
  • Smart content extraction to handle modern, dynamically rendered websites
  • Schema validation to ensure data quality and consistency

When to Use NextRows

NextRows excels in scenarios where traditional scraping approaches fall short:

  • Dynamic websites with JavaScript-rendered content
  • Complex data structures that require intelligent parsing
  • One-time or periodic data extraction tasks
  • Websites that change structure frequently
  • Data that requires semantic understanding

Core Concepts

Extraction Types

NextRows supports two main extraction approaches:

1. URL Extraction

Extract data directly from web pages by providing URLs:

{
  "type": "url",
  "data": ["https://example.com/page1", "https://example.com/page2"],
  "prompt": "Extract all product information including name, price, and rating"
}

2. Text Extraction

Extract data from raw text content you provide:

{
  "type": "text", 
  "data": ["Product: iPhone 14\nPrice: $999\nRating: 4.5/5"],
  "prompt": "Extract product details in a structured format"
}

Natural Language Prompts

The key to effective extraction is crafting clear, specific prompts:

Good prompts:

  • "Extract company name, job title, salary range, and location from each job posting"
  • "Get article title, author, publication date, and full text content"
  • "Find product name, price, availability status, and customer ratings"

Avoid vague prompts:

  • "Get all data"
  • "Extract everything important"
  • "Find product info"

Schema Validation

For consistent, reliable data, define a schema using JSON Schema:

{
  "type": "url",
  "data": ["https://jobs.example.com"],
  "prompt": "Extract job postings",
  "schema": {
    "type": "array",
    "items": {
      "type": "object",
      "properties": {
        "title": {"type": "string"},
        "company": {"type": "string"},
        "salary": {"type": "string"},
        "location": {"type": "string"},
        "posted_date": {"type": "string"}
      },
      "required": ["title", "company"]
    }
  }
}
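
Because the API returns extracted rows as JSON, you can also re-validate them on your side to catch drift early. A minimal sketch in Python using the jsonschema package (the client-side check is a suggestion, not part of the API):

# pip install jsonschema
from jsonschema import validate, ValidationError

# Same schema as in the request above, reused for a client-side check.
job_schema = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "company": {"type": "string"},
        },
        "required": ["title", "company"],
    },
}

def check_rows(rows):
    """Raise if the extracted rows drift from the expected shape."""
    try:
        validate(instance=rows, schema=job_schema)
    except ValidationError as err:
        print(f"Schema mismatch: {err.message}")
        raise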

Authentication

All requests require authentication using your API key in the Authorization header:

Authorization: Bearer sk-nr-your-api-key-here

Keep your API key secure! Never expose it in client-side code or public repositories.
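
One simple way to keep the key out of source control is to read it from an environment variable at runtime. A minimal sketch in Python (the NEXTROWS_API_KEY variable name is just a convention, not required by the API):

import os

# Fails fast with a KeyError if the key is not set in the environment.
API_KEY = os.environ["NEXTROWS_API_KEY"]

HEADERS = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}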

Making Requests

Basic Request Structure

curl -X POST https://api.nextrows.com/v1/extract \
  -H "Authorization: Bearer sk-nr-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "url",
    "data": ["https://example.com"],
    "prompt": "Your extraction prompt here"
  }'
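
The same request in Python using the requests library, in case that fits your stack better than curl:

import requests

response = requests.post(
    "https://api.nextrows.com/v1/extract",
    headers={
        "Authorization": "Bearer sk-nr-your-api-key",
        "Content-Type": "application/json",
    },
    json={
        "type": "url",
        "data": ["https://example.com"],
        "prompt": "Your extraction prompt here",
    },
    timeout=120,  # extractions can take a while; adjust to taste
)
response.raise_for_status()
print(response.json())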

Response Format

Successful responses return structured data:

{
  "success": true,
  "data": [
    {
      "field1": "value1",
      "field2": "value2"
    }
  ]
}

Error responses include details for troubleshooting:

{
  "success": false,
  "error": "Failed to extract data from URLs"
}
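
In application code, branch on the success flag before reading data. A minimal sketch, assuming the response shapes shown above:

def handle_response(payload):
    """Return extracted rows, or raise with the API's error message."""
    if payload.get("success"):
        return payload["data"]
    raise RuntimeError(payload.get("error", "Unknown extraction error"))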

Advanced Features

Multiple URL Processing

Process multiple URLs in a single request:

{
  "type": "url",
  "data": [
    "https://site1.com/page1",
    "https://site1.com/page2", 
    "https://site2.com/products"
  ],
  "prompt": "Extract product information from each page"
}

Error Handling

NextRows handles various error scenarios gracefully:

  • Partial failures: If some URLs fail, successful extractions are still returned
  • Timeout protection: Long-running extractions are automatically managed
  • Rate limiting: Built-in backoff for respectful scraping
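
On your side of the connection, it is still worth retrying transient failures. A minimal sketch with exponential backoff, assuming the API signals rate limiting with a standard HTTP 429 status (that status code is an assumption; adjust to what you actually observe):

import time
import requests

def extract_with_retry(payload, headers, max_attempts=4):
    """POST to /v1/extract, backing off on transient failures."""
    for attempt in range(max_attempts):
        resp = requests.post(
            "https://api.nextrows.com/v1/extract",
            headers=headers, json=payload, timeout=120,
        )
        # Retry on rate limiting (429) and server errors (5xx); assumption:
        # the API uses standard HTTP status codes for these cases.
        if resp.status_code == 429 or resp.status_code >= 500:
            time.sleep(2 ** attempt)  # 1s, 2s, 4s, ...
            continue
        resp.raise_for_status()
        return resp.json()
    raise RuntimeError("Extraction failed after retries")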

Best Practices

Crafting Effective Prompts

  1. Be specific about data fields:

    ❌ "Get product data"
    ✅ "Extract product name, price in USD, star rating, and number of reviews"

  2. Specify data format when needed:

    ✅ "Extract publication date in YYYY-MM-DD format"
    ✅ "Get price as a number without currency symbols"

  3. Handle edge cases:

    ✅ "Extract salary range, use 'Not specified' if salary is not mentioned"

Managing Credits Efficiently

  • Use specific prompts to avoid processing data you don't need
  • Monitor your credit usage and API performance
  • Consider batch processing for large datasets

Handling Different Website Types

E-commerce sites:

{
  "prompt": "Extract product name, current price, original price if on sale, rating score, number of reviews, and availability status"
}

Job boards:

{
  "prompt": "Extract job title, company name, location, salary range, experience level, and application deadline"
}

News sites:

{
  "prompt": "Extract article headline, author name, publication date, article text, and tags or categories"
}

Development Workflow

  1. Start with simple prompts to test your extractions
  2. Refine prompts based on initial results
  3. Add schema validation for production consistency
  4. Implement error handling in your application
  5. Monitor performance and optimize as needed
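
For steps 4 and 5, even lightweight instrumentation goes a long way. A minimal sketch that times each call and reports row counts (extract_with_retry is the helper sketched under Error Handling above, not an official client):

import time

def timed_extract(payload, headers):
    """Run one extraction and log how long it took and what came back."""
    start = time.monotonic()
    result = extract_with_retry(payload, headers)
    elapsed = time.monotonic() - start
    rows = result.get("data", []) if result.get("success") else []
    print(f"extracted {len(rows)} rows in {elapsed:.1f}s")
    return result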

Rate Limits and Scaling

Current Limits

  • Requests per minute: 20
  • Maximum URLs per request: 20

Scaling Strategies

For high-volume use cases:

  1. Implement request queuing to handle bursts
  2. Use batch processing to maximize throughput
  3. Optimize request patterns to stay within limits
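
A minimal sketch tying these strategies to the documented limits: URLs are split into batches of at most 20 (the per-request cap) and requests are paced to stay under 20 per minute (extract_with_retry is the earlier sketch; a production queue would be more sophisticated):

import time

MAX_URLS_PER_REQUEST = 20   # documented cap on URLs per request
MIN_SECONDS_BETWEEN = 3.0   # 20 requests/minute -> one every 3 seconds

def extract_many(urls, prompt, headers):
    """Process a large URL list in rate-limit-friendly batches."""
    results = []
    for i in range(0, len(urls), MAX_URLS_PER_REQUEST):
        batch = urls[i : i + MAX_URLS_PER_REQUEST]
        payload = {"type": "url", "data": batch, "prompt": prompt}
        results.append(extract_with_retry(payload, headers))
        time.sleep(MIN_SECONDS_BETWEEN)  # simple pacing between batches
    return results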

Next Steps