404 Watch API

Link checker API with advanced features and reporting capabilities.

  • Free Plan ($0.00 monthly): 10 requests per month. Free for lifetime, no credit card required.
  • Starter Plan ($9.99 monthly, most popular): 3,000 requests per month. Standard support, highly configurable, callback functionality, can check asset files, can check external links.
  • Pro Plan ($39.99 monthly): 15,000 requests per month. Standard support, highly configurable, callback functionality, can check asset files, can check external links.
  • Custom Plan (volume pricing, contact us): any request volume you need. Highly configurable, callback functionality, can check asset files, can check external links.

Broken links not only frustrate users but also damage your site's search engine visibility, as Google discourages broken links and downgrades your SEO reputation accordingly. Avoid linking to broken content, and avoid having pages on your own site that do not work.

404 Watch API is one of the most powerful link checkers on the market, with advanced features including the following.

  • Can optionally respect nofollow attributes on hrefs
  • Can check external links
  • Can optionally discard query parameters
  • Can optionally discard hash parameters
  • Can check images, JS, and CSS files for broken links
  • Can whitelist domains and exclude domains from checking
  • Can call a callback URL when link checking is done, or you can poll for the results instead

Whitelisting domains

You may have multiple domains that are served from a single site and need to be treated as internal resources. Add those domains to whitelisted_domains_list when creating the link checker job, and they will be treated as internal.

Excluding domains and URLs from checking

You may wish to exclude specific domains or URLs from link checking. Add the domain names to excluded_domains_list, or the full URLs to excluded_urls_list, when creating the link checker job.
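As a rough illustration of how these list variables fit into a job request, the sketch below assembles a JSON body for POST /job. The helper name build_job_payload and the sample domains are assumptions made for this example; only the key names come from this page.

```python
import json

def build_job_payload(url, whitelisted=None, excluded_domains=None, excluded_urls=None):
    """Assemble a JSON-serializable body for POST /job (hypothetical helper)."""
    payload = {"url": url}
    # Only include the list variables that were actually supplied.
    if whitelisted:
        payload["whitelisted_domains_list"] = list(whitelisted)
    if excluded_domains:
        payload["excluded_domains_list"] = list(excluded_domains)
    if excluded_urls:
        payload["excluded_urls_list"] = list(excluded_urls)
    return payload

payload = build_job_payload(
    "https://www.sampledomain.com",
    whitelisted=["assets.sampledomain.com"],          # treated as internal
    excluded_domains=["xyz.sampledomain.com"],        # skipped entirely
    excluded_urls=["https://sampledomain.com/a.html"],
)
body = json.dumps(payload, indent=2)
print(body)
```

The printed JSON can be passed as the request body of the POST /job call.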

Callback when finished

Link checking is a time-consuming job, as you may have hundreds of URLs (and assets) on your site, so the API takes an asynchronous approach. You create a link checking job using the POST /job endpoint and get an ID in response. You can then poll the GET /job/{id} endpoint to watch the progress of an ongoing job or to get the results of a finished one.
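The polling approach could be sketched roughly as follows. The function wait_for_job and its parameters are hypothetical, and the fetching of GET /job/{id} is passed in as a callable so the loop can be shown without network access. The only status value this page documents is "finished"; the "running" value in the demo is a placeholder, not a confirmed API value.

```python
import time

def wait_for_job(fetch_job, job_id, interval=5.0, max_attempts=60, sleep=time.sleep):
    """Poll until the job reports status "finished" (hypothetical helper).

    fetch_job is any callable that returns the parsed JSON of GET /job/{id}.
    """
    for _ in range(max_attempts):
        job = fetch_job(job_id)
        if job.get("status") == "finished":
            return job
        sleep(interval)  # wait before asking again
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} polls")

# Demo with a fake fetcher standing in for the real HTTP call.
_responses = iter([
    {"id": "job-1", "status": "running", "progress": {"percentage": 40.0}},  # placeholder status
    {"id": "job-1", "status": "finished", "progress": {"percentage": 100.0}},
])

def fake_fetch(job_id):
    return next(_responses)

result = wait_for_job(fake_fetch, "job-1", interval=0, sleep=lambda seconds: None)
print(result["status"])
```

Keep in mind that each poll presumably counts as a request against your plan, so a longer interval may be preferable on small plans.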

Optionally, you may provide a callback URL when creating the link checker job via the POST /job endpoint. In that case you don't need to poll the GET /job/{id} endpoint for results, as the API calls the provided callback URL (via HTTP POST) automatically when the process has finished.

Optionally, you may also provide a callback_secret variable when creating the link checker job. Its value is sent with the callback request in the X-Callback-Secret HTTP header, which you can check for authentication purposes.
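On the receiving side, checking the X-Callback-Secret header might look like the sketch below. The function name is_trusted_callback is hypothetical, and the headers are modeled as a plain dict; a real web framework will expose them through its own request object. The constant-time comparison is a general precaution, not something this API requires.

```python
import hmac

def is_trusted_callback(headers, expected_secret):
    """Return True if the X-Callback-Secret header matches the secret we configured."""
    received = ""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "x-callback-secret":
            received = value
            break
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(received.encode("utf-8"), expected_secret.encode("utf-8"))

good = is_trusted_callback({"X-Callback-Secret": "supersecret_key"}, "supersecret_key")
bad = is_trusted_callback({"x-callback-secret": "guess"}, "supersecret_key")
missing = is_trusted_callback({"Content-Type": "application/json"}, "supersecret_key")
print(good, bad, missing)
```

A callback handler would typically reject the request (for example with HTTP 401) when this check fails.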

Sample Request for creating a new link checker job

Below is a sample request sent to the POST /job endpoint. It contains many of the configuration parameters for tuning the link checking process.

curl --location --request POST 'https://api.apilayer.com/404_watch/job' \
--header 'Content-Type: application/json' \
--header 'apikey: YOUR API KEY' \
--data-raw '{
  "url": "https://p1.rs",
  "max_levels": 2,
  "fetch_external": false,
  "check_images": false,
  "check_css": true,
  "callback": "https://mydomain.com/callback",
  "callback_secret": "supersecret_key",
  "check_js": true,
  "whitelisted_domains_list": [
      "apilayer.com"
  ]
}'

The response looks like this:

{
    "id": "c3f7b23e-a239-4af4-b9ec-698a3a6d0a21"
}

You can use this id to query the results via the GET /job/{id} endpoint.

curl --location --request GET 'https://api.apilayer.com/404_watch/job/c3f7b23e-a239-4af4-b9ec-698a3a6d0a21' \
--header 'apikey: YOUR API KEY'

The response contains comprehensive details about the ongoing process and the results. See below:

{
    "id": "c3f7b23e-a239-4af4-b9ec-698a3a6d0a21",
    "created_at": 1609681539,
    "status": "finished",
    "url": "https://apilayer.com",
    "progress": {
        "discovered": 117,
        "checked": 117,
        "percentage": 100.0
    },
    "status_codes": {
        "503": 4,
        "200": 110
    },
    "content_types": {
        "image/svg+xml": 10,
        "image/png": 25,
        "text/css": 6,
        "text/html": 60,
        "image/jpeg": 5,
        "application/javascript": 8,
        "application/x-javascript": 1
    },
    "options": {
        "callback_secret": null,
        "check_css": true,
        "max_levels": 3,
        "check_js": true,
        "max_links": 1000,
        "excluded_domains_list": [],
        "fetch_nofollow": false,
        "excluded_urls_list": [],
        "fetch_external": true,
        "whitelisted_domains_list": "assets.apilayer.com",
        "omit_query_params": false,
        "callback": null,
        "omit_hash_params": true,
        "check_images": true
    }
}

Date fields such as created_at are Unix timestamps (seconds since the epoch).
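To make the summary report easier to read, the sketch below converts created_at to a UTC datetime and totals the failing status codes. The helper summarize_job is hypothetical, and treating every status code of 400 or above as "failing" is an assumption made for this example. The input is trimmed from the sample response above.

```python
from datetime import datetime, timezone

def summarize_job(job):
    """Condense a GET /job/{id} response into a few headline values (hypothetical helper)."""
    created = datetime.fromtimestamp(job["created_at"], tz=timezone.utc)
    codes = job.get("status_codes", {})
    # Status codes arrive as string keys, e.g. "503".
    failing = {code: count for code, count in codes.items() if int(code) >= 400}
    return {
        "created_at": created.isoformat(),
        "done": job["progress"]["percentage"] >= 100.0,
        "failing": failing,
        "failing_total": sum(failing.values()),
    }

job = {
    "created_at": 1609681539,
    "status": "finished",
    "progress": {"discovered": 117, "checked": 117, "percentage": 100.0},
    "status_codes": {"503": 4, "200": 110},
}
summary = summarize_job(job)
print(summary)
```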

Getting the details for each link that is checked

To get all the links that were discovered and checked, use the GET /job/{id}/links endpoint. See the following example.

curl --location --request GET 'https://api.apilayer.com/404_watch/job/c3f7b23e-a239-4af4-b9ec-698a3a6d0a21/links' \
--header 'apikey: YOUR API KEY'

The response contains all the links along with their content types and HTTP status codes. You may filter and use it as you wish.

{
    "job_id": "d0de484e-c18f-4ee8-b84e-4ba63907e283",
    "status": "finished",
    "created_at": 1609681539,
    "links": [
        {
            "url": "https://apilayer.com",
            "content_type": "text/html",
            "is_timeout": false,
            "http_status": 200,
            "fetched_at": 1609681563
        },
        {
            "url": "https://assets.apilayer.com/apis/image_similarity.png",
            "content_type": "image/png",
            "is_timeout": false,
            "http_status": 200,
            "fetched_at": 1609681578
        },
        {
            "url": "https://apilayer.com/marketplace/description/textgears-api",
            "content_type": "text/html",
            "is_timeout": false,
            "http_status": 200,
            "fetched_at": 1609681592
        },
        {
            "url": "https://apilayer.com/marketplace/category/text-processing-apis",
            "content_type": "text/html",
            "is_timeout": false,
            "http_status": 200,
            "fetched_at": 1609746185
        },
        {
            "url": "https://js.hs-scripts.com/7564526.js",
            "content_type": "application/javascript",
            "is_timeout": false,
            "http_status": 200,
            "fetched_at": 1609746223
        },
        {
            "url": "https://apilayer.com/marketplace/tag/spelling",
            "content_type": "text/html",
            "is_timeout": false,
            "http_status": 200,
            "fetched_at": 1609746233
        },
        {
            "url": "https://apilayer.com/marketplace/tag/text-tools",
            "content_type": "text/html",
            "is_timeout": false,
            "http_status": 200,
            "fetched_at": 1609746256
        },
        {
            "url": "https://textgears.com/assets/img/logos/apple/120.png",
            "content_type": "image/png",
            "is_timeout": false,
            "http_status": 200,
            "fetched_at": 1609746270
        },
        {
            "url": "https://apilayer.com/assets/css/documentation.css?6",
            "content_type": "text/css",
            "is_timeout": false,
            "http_status": 200,
            "fetched_at": 1609746374
        },
        {
            "url": "https://apilayer.com/assets/js/marketplace/marketplace.js?52",
            "content_type": "application/javascript",
            "is_timeout": false,
            "http_status": 200,
            "fetched_at": 1609746577
        }
    ],
    "query": {
        "limit": 10,
        "offset": 0,
        "page": 0,
        "total_count": 117
    }
}
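One way to work with such a response is sketched below: it picks out links that timed out or returned an error status, and uses the query block to work out how many pages exist. The helper names are hypothetical, the 404 entry in the demo data is invented for illustration, and counting status codes of 400 or above as broken is an assumption.

```python
import math

def broken_links(page):
    """Links that timed out or returned an HTTP status of 400 or above."""
    return [
        link for link in page["links"]
        if link.get("is_timeout") or (link.get("http_status") or 0) >= 400
    ]

def total_pages(query):
    """Number of pages needed to read total_count items at the given limit."""
    return math.ceil(query["total_count"] / query["limit"])

page = {
    "links": [
        {"url": "https://apilayer.com", "content_type": "text/html",
         "is_timeout": False, "http_status": 200, "fetched_at": 1609681563},
        # Invented entry, shown only to demonstrate the filter.
        {"url": "https://apilayer.com/missing", "content_type": "text/html",
         "is_timeout": False, "http_status": 404, "fetched_at": 1609681570},
    ],
    "query": {"limit": 10, "offset": 0, "page": 0, "total_count": 117},
}

bad = broken_links(page)
pages = total_pages(page["query"])
print([link["url"] for link in bad], pages)
```

With the page parameter being 0 based, the pages to request would run from 0 to pages - 1.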

404 Watch API Reference

This API is organized around REST. Our API has predictable resource-oriented URLs, accepts JSON-encoded request bodies, returns JSON-encoded responses, and uses standard HTTP response codes, authentication, and verbs.

Authentication

404 Watch API uses API keys to authenticate requests. You can view and manage your API keys on the Accounts page.

Your API keys carry many privileges, so be sure to keep them secure! Do not share your secret API keys in publicly accessible areas such as GitHub, client-side code, and so forth.

All requests made to the API must include a custom HTTP header named "apikey". How you set it differs with each programming language and HTTP client.

All API requests must be made over HTTPS. Calls made over plain HTTP will fail. API requests without authentication will also fail.
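As one possible illustration in Python, the sketch below prepares an authenticated request with the standard library. The helper build_request is hypothetical; the base URL is taken from the curl samples on this page. The request is only constructed here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.apilayer.com/404_watch"

def build_request(path, api_key, method="GET", payload=None, base_url=BASE_URL):
    """Prepare (but do not send) an authenticated API request (hypothetical helper)."""
    url = base_url + path
    if not url.startswith("https://"):
        raise ValueError("API requests must be made over HTTPS")
    headers = {"apikey": api_key}  # the custom authentication header
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)

req = build_request("/job", "YOUR API KEY", method="POST", payload={"url": "https://p1.rs"})
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req) (needs a real key and network access).
```

Reading the key from an environment variable rather than hard-coding it is one way to keep it out of source control.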

Endpoints

Get link checker job details and summary report by ID (GET /job/{id})

Parameters

id (required)

Job Id to get progress and summary report

Location: Path, Data Type: string

** A word enclosed in curly brackets "{ }" is a parameter. Replace it, curly brackets included, with your own value when executing the request.

Returns

A JSON object with the job's progress and summary report, as in the sample GET /job/{id} response shown earlier.

Full list of detected links and their validation statuses (GET /job/{id}/links)

Parameters

id (required)

Job id

Location: Path, Data Type: string

filter_content_type (optional)

Content Type to filter results (URL Encoded)

Location: Query, Data Type: string

filter_status (optional)

HTTP Status code to filter results

Location: Query, Data Type: integer

limit (optional)

How many items should be fetched in a page

Location: Query, Data Type: integer

page (optional)

0 based page parameter that is used for pagination.

Location: Query, Data Type: integer
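The query parameters above could be combined into a request URL along these lines. The helper links_url is hypothetical; urlencode takes care of the URL encoding that filter_content_type requires.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.apilayer.com/404_watch"

def links_url(job_id, filter_content_type=None, filter_status=None, limit=None, page=None):
    """Build the GET /job/{id}/links URL with optional filters (hypothetical helper)."""
    params = {}
    if filter_content_type is not None:
        params["filter_content_type"] = filter_content_type
    if filter_status is not None:
        params["filter_status"] = int(filter_status)
    if limit is not None:
        params["limit"] = int(limit)
    if page is not None:
        params["page"] = int(page)  # 0 based
    url = f"{BASE_URL}/job/{quote(job_id, safe='')}/links"
    query = urlencode(params)  # percent-encodes values such as "image/svg+xml"
    return f"{url}?{query}" if query else url

url = links_url("c3f7b23e-a239-4af4-b9ec-698a3a6d0a21",
                filter_content_type="image/svg+xml", filter_status=404, limit=50, page=2)
print(url)
```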

Returns

A JSON object with the matching links and pagination details, as in the sample GET /job/{id}/links response shown earlier.

Gets a summary of all the link checker jobs

Parameters

No parameters.

Returns

A JSON-encoded summary of all your link checker jobs.

Creates a new link checker job (POST /job)

Parameters

body (required)

Fetch parameters for link checker job

  • url: (required) URL to start checking.
  • callback: If provided, this callback URL will be called (HTTP POST) when link checking is finished.
  • callback_secret: If a callback URL is provided, this secret will be sent in the X-Callback-Secret header.
  • max_levels: How many levels deep the link checker should go. (default: 3)
  • max_links: Hard limit for discovery of links. (default: 5000)
  • fetch_nofollow: Controls how links with the rel="nofollow" attribute are handled. If set to true, links with the nofollow flag will be fetched; otherwise they are skipped. (default: false)
  • fetch_external: Should external links be fetched? (default: false)
  • omit_query_params: Should query parameters be trimmed from the URL before fetching it? If set to true, http://mylink.com/path?a=1 will be fetched as http://mylink.com/path (default: false)
  • omit_hash_params: Should in-page links (hash parameters) be trimmed from the URL before fetching it? If set to true, http://mylink.com/path#abc will be fetched as http://mylink.com/path (default: true)
  • check_images: Should images be checked for broken links? (default: false)
  • check_css: Should CSS files be checked for broken links? (default: false)
  • check_js: Should JS files be checked for broken links? (default: false)
  • excluded_domains_list: A list of domain names that should not be checked. (example: ["sampledomain.com", "xyz.sampledomain.com"])
  • excluded_urls_list: A list of full URLs that should not be checked. (example: ["https://sampledomain.com/a.html", "http://xyz.sampledomain.com/b.png"])
  • whitelisted_domains_list: A list of domain names that should be treated as local (not external). (example: ["www.sampledomain.com", "assets.sampledomain.com"])

Example:

{
  "url": "https://p1.rs",
  "max_levels": 1,
  "fetch_external": false,
  "check_images": false,
  "check_css": true,
  "check_js": true,
  "whitelisted_domains_list": [
      "promptapi.com"
  ]
}

Location: Body, Data Type: JSON
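Since the body has many optional keys, a small client-side check before sending can catch misspelled names. The sketch below is only an illustration: the helper names are hypothetical, the defaults are copied from the parameter list above, and how the API itself treats unknown keys is not documented here.

```python
# Defaults as documented in the parameter list above.
DOCUMENTED_DEFAULTS = {
    "max_levels": 3,
    "max_links": 5000,
    "fetch_nofollow": False,
    "fetch_external": False,
    "omit_query_params": False,
    "omit_hash_params": True,
    "check_images": False,
    "check_css": False,
    "check_js": False,
}
# Documented keys that have no default value.
OTHER_KEYS = {
    "url", "callback", "callback_secret",
    "excluded_domains_list", "excluded_urls_list", "whitelisted_domains_list",
}

def check_payload(payload):
    """Return a list of problems found in a POST /job body (hypothetical helper)."""
    problems = []
    if not payload.get("url"):
        problems.append("missing required key: url")
    for key in sorted(set(payload) - OTHER_KEYS - set(DOCUMENTED_DEFAULTS)):
        problems.append(f"unknown key: {key}")
    return problems

def effective_options(payload):
    """Documented defaults overlaid with the values the caller supplied."""
    return {**DOCUMENTED_DEFAULTS, **payload}

problems = check_payload({"url": "https://p1.rs", "levels": 1, "check_css": True})
options = effective_options({"url": "https://p1.rs", "check_css": True})
print(problems, options["max_levels"], options["check_css"])
```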

Returns

A JSON object containing the id of the new job, as in the sample POST /job response shown earlier.

Rate Limiting

Each subscription has its own rate limit. When you become a member, you start by choosing a rate limit that suits your usage needs. Do not worry: you can upgrade or downgrade your plan at any time. For this reason, rather than starting with a larger plan than you need, we suggest starting with the Free plan or a small paid plan and upgrading once you have started using the API.

When you reach a rate limit (daily or monthly), the service stops serving requests and returns the HTTP 429 (Too Many Requests) status code for each request, with the following JSON body.

{
"message":"You have exceeded your daily\/monthly API rate limit. Please review and upgrade your subscription plan at https:\/\/apilayer.com\/subscriptions to continue."
}

A reminder email is sent to you when your API usage reaches 80% and again at 90%, so that you can take action, such as upgrading your plan, before your application is interrupted.

You can also check your rate limit programmatically. Each response from APILayer includes the following four HTTP headers with all the necessary information.

x-ratelimit-limit-month: Request limit per month
x-ratelimit-remaining-month: Request limit remaining this month
x-ratelimit-limit-day: Request limit per day
x-ratelimit-remaining-day: Request limit remaining today
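Reading these headers in client code might look like the following sketch. The helper parse_rate_limits is hypothetical, the headers are modeled as a plain dict, and the numbers in the demo are invented for illustration.

```python
def parse_rate_limits(headers):
    """Extract monthly and daily limits from response headers (hypothetical helper)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    limits = {}
    for period in ("month", "day"):
        limit = lowered.get(f"x-ratelimit-limit-{period}")
        remaining = lowered.get(f"x-ratelimit-remaining-{period}")
        if limit is None or remaining is None:
            continue  # header pair not present in this response
        limit, remaining = int(limit), int(remaining)
        used = (limit - remaining) / limit if limit else 0.0
        limits[period] = {"limit": limit, "remaining": remaining, "used_fraction": used}
    return limits

# Invented header values, for demonstration only.
limits = parse_rate_limits({
    "x-ratelimit-limit-month": "3000",
    "x-ratelimit-remaining-month": "600",
    "x-ratelimit-limit-day": "100",
    "x-ratelimit-remaining-day": "75",
})
print(limits["month"]["remaining"], limits["day"]["used_fraction"])
```

A client could, for instance, slow down or alert once used_fraction passes a threshold of its own choosing.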

You can contact our support team if you need help handling these headers in your application.

Error Codes

APILayer uses standard HTTP response codes to indicate the success or failure of an API request. In general: Codes in the 2xx range indicate success. Codes in the 4xx range indicate a client-side error, meaning the request failed given the information provided (e.g., a missing parameter or unauthorized access). Codes in the 5xx range indicate an error with APILayer's servers (normally this shouldn't happen at all).

If the response code is not in the 2xx range, the operation failed and you may need to act accordingly. The response body (in JSON format) contains a field called 'message' that briefly explains the error.

Status Code              Explanation
400 - Bad Request        The request was unacceptable, often due to a missing required parameter.
401 - Unauthorized       No valid API key provided.
404 - Not Found          The requested resource doesn't exist.
429 - Too Many Requests  API request limit exceeded. See the Rate Limiting section for more info.
5xx - Server Error       We have failed to process your request. (You can contact us anytime.)
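A client might map these codes to its own error categories roughly as sketched below. The function describe_error and the category names are hypothetical. Marking only 5xx responses as worth retrying is a judgment call made for this example, not documented API behavior.

```python
import json

CATEGORIES = {400: "bad_request", 401: "unauthorized", 404: "not_found", 429: "rate_limited"}

def describe_error(status_code, body_text):
    """Return None on success, else (category, message, retryable) (hypothetical helper)."""
    if 200 <= status_code < 300:
        return None
    try:
        message = json.loads(body_text).get("message", "")
    except (ValueError, AttributeError):
        message = ""  # body was not a JSON object
    if 500 <= status_code < 600:
        return ("server_error", message, True)
    return (CATEGORIES.get(status_code, "unexpected"), message, False)

# Body taken from the rate limiting example on this page.
rate_limited_body = r'{"message":"You have exceeded your daily\/monthly API rate limit. Please review and upgrade your subscription plan at https:\/\/apilayer.com\/subscriptions to continue."}'
error = describe_error(429, rate_limited_body)
print(error[0], error[2])
```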

You can always contact us for support and further assistance. We'll be glad to help you build your product.