A CORS proxy is a service that lets a site access resources on other websites that do not send an Access-Control-Allow-Origin header, without needing any control over those websites.

There are many implementations of CORS proxies, but I was interested in creating one without writing a single line of code, using nothing but nginx configuration.

I came up with two simple solutions.

Requested Resource is Passed as a Part of the Request-URI

The configuration below will handle requests such as https://cors-proxy.example.org/https://google.com/

server {
    listen 443 ssl;
    server_name cors-proxy.example.org;

    ssl_stapling on;
    ssl_stapling_verify on;
    ssl_certificate /etc/letsencrypt/live/example.org/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.org/privkey.pem;
    ssl_trusted_certificate /etc/letsencrypt/live/example.org/chain.pem;

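    # Keep "//" in the request URI intact so that "https://" survives.
    merge_slashes off;
    # Needed because proxy_pass below uses variables (run-time DNS lookups).
    resolver 127.0.0.1;

    # Split /<scheme>://<host>/<path> into named captures.
    location ~ ^/(?<pscheme>https?://)(?<phost>[^/]+)(?<ppath>/.*)$ {
        # Reject requests without a plausible Origin header.
        if ($http_origin !~ 'https?://[^.]+\.[^/]+') {
            return 400;
        }

        proxy_http_version 1.1;
        proxy_set_header Host $phost;
        # Route upstream redirects back through the proxy.
        proxy_redirect ~^(https?://)([^/]+)(.*)$ $scheme://$http_host/$1$2$3;
        proxy_pass $pscheme$phost$ppath$is_args$args;
        add_header Access-Control-Allow-Origin * always;
    }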
}

The very first lines configure the name of the server (cors-proxy.example.org) and the SSL configuration; you will need to adjust them to match your server name and paths to the SSL key and certificate.

The magic begins with the merge_slashes off line.

If you access https://cors-proxy.example.org/https://google.com/, nginx by default merges consecutive slashes, so the URI it actually matches against the location becomes /https:/google.com/. The merge_slashes off directive turns off this behavior.
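To make the effect concrete, here is a small Python sketch that imitates the slash merging and checks the result against the pattern used in the location block. It only approximates what nginx does internally, and the helper name merge_slashes is made up for this illustration.

```python
import re

# Same pattern as the location block; Python spells named groups (?P<name>...).
LOCATION_RE = re.compile(r"^/(?P<pscheme>https?://)(?P<phost>[^/]+)(?P<ppath>/.*)$")


def merge_slashes(uri: str) -> str:
    """Rough imitation of nginx's default: collapse runs of '/' into one."""
    return re.sub(r"/{2,}", "/", uri)


uri = "/https://google.com/"
merged = merge_slashes(uri)

print(merged)                                  # /https:/google.com/
print(LOCATION_RE.match(merged) is not None)   # False: "https:/" no longer looks like a scheme
print(LOCATION_RE.match(uri) is not None)      # True: the untouched URI matches
```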

Next, we need a DNS server: because proxy_pass is given variables, nginx has to resolve the target host name at run time. I have a DNS resolver on 127.0.0.1:53; you may use the one provided by your hosting company or ISP, but in that case do not forget to update the value of the resolver directive (http://nginx.org/en/docs/http/ngx_http_core_module.html#resolver).

The location directive parses the address into the scheme (pscheme), host (phost), and path (ppath) parts, and also makes sure that the requested address begins with either http:// or https://. (Note: I tried to use positional captures instead of named ones in the regular expression, but they did not play nice with the if clause: I got an “invalid URL prefix” error in nginx’s error log, probably because the regular expression in the if condition resets the positional captures.)
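The way the pattern splits a request can be tried out in Python. This is a sketch only: upstream_url is a hypothetical helper that mimics how the captures are glued back together for proxy_pass, and Python needs the (?P<name>...) spelling for named groups.

```python
import re
from typing import Optional

LOCATION_RE = re.compile(r"^/(?P<pscheme>https?://)(?P<phost>[^/]+)(?P<ppath>/.*)$")


def upstream_url(uri: str, args: str = "") -> Optional[str]:
    """Return the URL proxy_pass would receive, or None if the location does not match."""
    m = LOCATION_RE.match(uri)
    if m is None:
        return None
    is_args = "?" if args else ""  # mirrors nginx's $is_args
    return m["pscheme"] + m["phost"] + m["ppath"] + is_args + args


print(upstream_url("/https://google.com/search", "q=nginx"))  # https://google.com/search?q=nginx
print(upstream_url("/ftp://example.org/file"))                # None: only http and https are allowed
print(upstream_url("/https://google.com"))                    # None: ppath needs at least "/"
```

Note the last case: with this pattern the target URL must contain a path, even if it is only a trailing slash.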

The CORS specification requires every CORS request to include an Origin header, so the proxy rejects requests that lack a plausible one (this also prevents the use of the proxy for casual browsing); this is what the if ($http_origin !~ 'https?://[^.]+\.[^/]+') line does.
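A quick way to see which Origin values pass the check is to run the same expression in Python. This is a sketch: origin_accepted is a hypothetical helper, the sample origins are invented, and re.search is used because the nginx pattern is not anchored.

```python
import re

ORIGIN_RE = re.compile(r"https?://[^.]+\.[^/]+")


def origin_accepted(origin: str) -> bool:
    """True if the proxy would let the request through, False if it would answer 400."""
    # A missing Origin header shows up in nginx as an empty $http_origin.
    return ORIGIN_RE.search(origin) is not None


print(origin_accepted("https://app.example.com"))  # True
print(origin_accepted(""))                         # False: no Origin header
print(origin_accepted("null"))                     # False: opaque origin
print(origin_accepted("http://localhost:3000"))    # False: the host contains no dot
```

The last case means that pages served from a dot-less host such as localhost are rejected by this check.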

The proxy_redirect directive is another piece of magic: it makes sure that all redirects are served via our CORS proxy by rewriting a Location header of the form scheme://host/request into https://cors-proxy.example.org/scheme://host/request.
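The rewrite can be imitated in Python as follows. This is a sketch under the assumption that the client reached the proxy over https at cors-proxy.example.org; rewrite_location is a hypothetical helper, not part of nginx.

```python
import re

REDIRECT_RE = re.compile(r"^(https?://)([^/]+)(.*)$")


def rewrite_location(location: str, scheme: str, http_host: str) -> str:
    """Imitate proxy_redirect: prefix absolute http(s) redirects with the proxy address."""
    m = REDIRECT_RE.match(location)
    if m is None:
        return location  # the pattern does not match, so the header is left alone
    return f"{scheme}://{http_host}/{m.group(1)}{m.group(2)}{m.group(3)}"


print(rewrite_location("https://www.google.com/", "https", "cors-proxy.example.org"))
# https://cors-proxy.example.org/https://www.google.com/
print(rewrite_location("/login", "https", "cors-proxy.example.org"))
# /login
```

Relative redirects such as /login do not match the pattern and pass through unchanged in this sketch.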

Finally, add_header Access-Control-Allow-Origin * always adds the Access-Control-Allow-Origin: * header to all responses, regardless of the response code.
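For testing outside a browser, a request to this proxy could be put together as in the sketch below. A browser sets the Origin header by itself; here it has to be supplied by hand. The helper build_proxied_request and the origin https://app.example.com are assumptions made for the example, and nothing is actually sent.

```python
from urllib.request import Request

PROXY = "https://cors-proxy.example.org/"


def build_proxied_request(target_url: str, origin: str) -> Request:
    """Build (but do not send) a request for target_url that goes through the proxy."""
    # The target URL is appended as-is, which is the form the first configuration expects.
    return Request(PROXY + target_url, headers={"Origin": origin})


req = build_proxied_request("https://google.com/", "https://app.example.com")
print(req.full_url)              # https://cors-proxy.example.org/https://google.com/
print(req.get_header("Origin"))  # https://app.example.com
```

Passing req to urllib.request.urlopen would perform the request against a running proxy.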

Requested Resource is Passed in the Query String

The configuration below will handle requests such as https://cors-proxy.example.org/?https://google.com/.

server {
    listen 443 ssl;
    server_name cors-proxy.example.org;

    ssl_stapling on;
    ssl_stapling_verify on;
    ssl_certificate /etc/letsencrypt/live/example.org/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.org/privkey.pem;
    ssl_trusted_certificate /etc/letsencrypt/live/example.org/chain.pem;

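    # Needed because proxy_pass below uses a variable (run-time DNS lookups).
    resolver 127.0.0.1;

    location = / {
        # Reject requests without a plausible Origin header.
        if ($http_origin !~ 'https?://[^.]+\.[^/]+') {
            return 400;
        }

        # Extract the target host from the query string.
        set $phost "";
        if ($args ~ '^https?://([^/]+)') {
            set $phost $1;
        }

        # No http(s) URL in the query string: nothing to proxy.
        if ($phost = "") {
            return 400;
        }

        proxy_http_version 1.1;
        proxy_set_header Host $phost;
        # Route upstream redirects back through the proxy.
        proxy_redirect ~^(https?://)([^/]+)(.*)$ $scheme://$http_host/?$1$2$3;
        # The whole query string is the URL to fetch.
        proxy_pass $args;
        add_header Access-Control-Allow-Origin * always;
    }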
}

This configuration looks pretty much the same as the previous one. The key differences are:

  • We need to extract the host to connect to from the query string. This is done by if ($args ~ '^https?://([^/]+)'). If no host name is found, we issue a 400 Bad Request response (if ($phost = "")).
  • We pass $args as the argument to proxy_pass, because the entire query string is the URL of the requested resource. The host extraction above has already validated that it begins with http:// or https://.
  • The proxy_redirect replacement is updated to include ?, so that redirects are also passed in the query string.
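The differences listed above can be sketched in Python. The helpers extract_host and rewrite_location are hypothetical and only imitate the two regular expressions from the configuration; the proxy address is assumed to be https://cors-proxy.example.org.

```python
import re

HOST_RE = re.compile(r"^https?://([^/]+)")
REDIRECT_RE = re.compile(r"^(https?://)([^/]+)(.*)$")


def extract_host(args: str) -> str:
    """Imitate the $phost extraction; an empty result corresponds to the 400 response."""
    m = HOST_RE.match(args)
    return m.group(1) if m else ""


def rewrite_location(location: str, scheme: str, http_host: str) -> str:
    """Imitate proxy_redirect for the query-string variant (note the added '?')."""
    m = REDIRECT_RE.match(location)
    if m is None:
        return location
    return f"{scheme}://{http_host}/?{m.group(1)}{m.group(2)}{m.group(3)}"


print(extract_host("https://google.com/search?q=nginx"))  # google.com
print(extract_host("https://google.com"))                 # google.com (no trailing slash needed)
print(repr(extract_host("google.com")))                   # '' -> 400 Bad Request
print(rewrite_location("https://www.google.com/", "https", "cors-proxy.example.org"))
# https://cors-proxy.example.org/?https://www.google.com/
```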

Of course, these configurations have a lot of room for improvement, but they can still serve as a starting point for a curious reader.

CORS Proxy by Means of nginx