Getting Started Guide

This guide will help you get started with the ParseMyFile API.

Overview

The ParseMyFile API extracts structured data, returned as JSON, from PDF, DOCX, and XLSX documents and from images. A custom YAML configuration file defines which fields to extract.

Prerequisites

  • A valid API key
  • A file to process (PDF, DOCX, XLSX or image)
  • A YAML configuration file
  • An HTTP client (cURL, Postman, or code in your preferred language)

Installation and Configuration

1. Get an API Key

To get an API key, create an account on the website: https://parsemyfile.com. Once logged in, go to the "API Keys" section. Create a key or use an existing valid key. This key will be required for all API requests.
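
Since the key is sent with every request, it can help to keep it out of source code. The following is a minimal sketch that reads it from an environment variable. The variable name PARSEMYFILE_API_KEY and the helper build_headers are hypothetical and not part of the API; only the X-API-KEY header name comes from this guide.

```python
import os

# Hypothetical variable name; this guide does not prescribe one.
API_KEY_ENV_VAR = "PARSEMYFILE_API_KEY"


def build_headers(env=None):
    """Return the request headers, reading the API key from the environment."""
    env = os.environ if env is None else env
    api_key = env.get(API_KEY_ENV_VAR, "").strip()
    if not api_key:
        raise RuntimeError(f"Set {API_KEY_ENV_VAR} before calling the API")
    # X-API-KEY is the header name used by the examples in this guide.
    return {"X-API-KEY": api_key}
```

The returned dictionary can be passed as the headers argument of an HTTP client call.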

2. Prepare your files

File to process

The limits below depend on your subscription:

  • Supported formats: PDF, JPG, PNG, JPEG, XLSX, DOCX
  • Maximum size: 1 to 10 MB, depending on your subscription
  • Recommended quality: 300 DPI minimum
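
A local check before uploading can catch an unsupported format or an oversized file early. The sketch below is one possible way to do this. The helper check_file is hypothetical, the default limit of 1 MB is an assumption (your subscription may allow more), and treating 1 MB as 1024 × 1024 bytes is also an assumption.

```python
import os

# Formats listed in this guide.
SUPPORTED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png", ".xlsx", ".docx"}


def check_file(path, max_size_mb=1):
    """Return a list of problems found with the file; empty if none.

    max_size_mb is an assumption: the real limit depends on the subscription.
    """
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if not os.path.isfile(path):
        problems.append("file not found")
    elif os.path.getsize(path) > max_size_mb * 1024 * 1024:
        # Assumes 1 MB = 1024 * 1024 bytes.
        problems.append(f"file larger than {max_size_mb} MB")
    return problems
```

An empty list means the file passed these local checks; the API may still apply rules of its own.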

YAML configuration file

Create a YAML file describing the fields to extract.

Here's an example:

yaml
schemas:
  data:
    type: object
    properties:
      name:
        type: string
        description: client name
      email:
        type: string
        description: client email address
      phone:
        type: string
        description: client phone number
      amount:
        type: double
        description: total invoice amount
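
When the list of fields lives in code, the configuration can be generated instead of written by hand. The sketch below renders the same layout as the example above. The helper render_config is hypothetical, and it assumes descriptions are plain text without YAML special characters such as colons or quotes, since it does no quoting.

```python
def render_config(fields):
    """Render a configuration with the same layout as the example above.

    fields maps a field name to a (type, description) pair.
    No quoting is applied, so descriptions must be plain text.
    """
    lines = ["schemas:", "  data:", "    type: object", "    properties:"]
    for name, (field_type, description) in fields.items():
        lines.append(f"      {name}:")
        lines.append(f"        type: {field_type}")
        lines.append(f"        description: {description}")
    return "\n".join(lines) + "\n"


config_text = render_config({
    "name": ("string", "client name"),
    "amount": ("double", "total invoice amount"),
})
```

The resulting text can be written to a .yaml file and sent as the yaml_file form field.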

First API Call

With cURL

bash
curl -X POST "https://api.parsemyfile.com/api/v1/generate" \
  -H "X-API-KEY: your_api_key_here" \
  -F "file=@my_document.pdf" \
  -F "yaml_file=@my_configuration.yaml"

With JavaScript

javascript
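// fileInput and yamlFileInput are <input type="file"> elements on your page.
const formData = new FormData();
formData.append('file', fileInput.files[0]);
formData.append('yaml_file', yamlFileInput.files[0]);

// Do not set Content-Type yourself: the browser adds the multipart boundary.
const response = await fetch('https://api.parsemyfile.com/api/v1/generate', {
  method: 'POST',
  headers: {
    'X-API-KEY': 'your_api_key_here'
  },
  body: formData
});

if (!response.ok) {
  throw new Error(`Request failed with HTTP ${response.status}`);
}

const result = await response.json();
console.log(result);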

With Python

python
import requests

url = "https://api.parsemyfile.com/api/v1/generate"
headers = {"X-API-KEY": "your_api_key_here"}

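# Open both files in context managers so they are closed after the upload.
with open('document.pdf', 'rb') as document, open('configuration.yaml', 'rb') as config:
    files = {
        'file': ('document.pdf', document, 'application/pdf'),
        'yaml_file': ('configuration.yaml', config, 'text/yaml')
    }
    response = requests.post(url, headers=headers, files=files)
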
result = response.json()
print(result)
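
The examples above assume the call succeeds. A minimal sketch of defensive response handling follows. The helper parse_api_response is hypothetical, and because this guide does not document the API's error format, the sketch only reports the HTTP status and a truncated raw body.

```python
import json


def parse_api_response(status_code, body):
    """Return the decoded JSON body, or raise with a readable message."""
    if not 200 <= status_code < 300:
        # The error format is not documented in this guide, so the raw
        # body is truncated and included as-is.
        raise RuntimeError(
            f"API request failed with HTTP {status_code}: {body[:200]}"
        )
    try:
        return json.loads(body)
    except json.JSONDecodeError as exc:
        raise RuntimeError("API returned a non-JSON body") from exc
```

With the requests example, this could be called as parse_api_response(response.status_code, response.text).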

API Health Check

Before processing your documents, you can verify that the API is working correctly:

bash
curl -X GET "https://api.parsemyfile.com/health"

Expected response:

json
{
  "status": "healthy",
  "timestamp": "2024-01-15T10:30:00Z",
  "version": "1.0.0"
}
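
The same check can be scripted, for example before a batch run. The sketch below uses only the Python standard library. The helpers is_healthy and check_health are hypothetical, and the sketch assumes the response has the shape shown above, relying only on the status field.

```python
import json
import urllib.request

HEALTH_URL = "https://api.parsemyfile.com/health"


def is_healthy(payload):
    """True when the payload reports the "healthy" status shown above."""
    return isinstance(payload, dict) and payload.get("status") == "healthy"


def check_health(url=HEALTH_URL, timeout=5):
    """Fetch the health endpoint; return False on network or parsing errors."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_healthy(json.loads(resp.read().decode("utf-8")))
    except (OSError, ValueError):
        # URLError and HTTPError are OSError subclasses; JSON and Unicode
        # decoding errors are ValueError subclasses.
        return False
```

Calling check_health() returns True only when the endpoint answers with a healthy status within the timeout.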

Next Steps

Continue with the ParseMyFile API Documentation.