Most of the protocols you use with curl speak TCP. With TCP, a client such as curl must first figure out the IP address(es) of the host you want to communicate with, then connect to it. "Connecting to it" means performing a TCP protocol handshake.
For ordinary command line usage, operating on a URL, these are details which are taken care of under the hood, and which you can mostly ignore. But at times you might find yourself wanting to tweak the specifics…
Name resolve tricks
Edit the hosts file
Maybe you want the command
curl http://example.com to connect to your local
server instead of the actual server.
You can normally and easily do that by editing your
hosts file (
on Linux and Unix systems) and adding, for example,
127.0.0.1 example.com to
redirect the host to your localhost. However this edit requires admin access and
it has the downside that it affects all other applications at the same time.
Change the Host: header
Host: header is the normal way an HTTP client tells the HTTP server which
server it speaks to, as typically an HTTP server serves many different names
using the same software instance.
So, by passing in a custom modified
Host: header you can have the
server respond with the contents of the site even when you didn't actually
connect to that host name.
For example, you run a test instance of your main site
your local machine and you want to have curl ask for the index html:
curl -H "Host: www.example.com" http://localhost/
When setting a custom
Host: header and using cookies, curl will extract the
custom name and use that as host when matching cookies to send off.
Host: header is not enough when communicating with an HTTPS server. With
HTTPS there's a separate extension field in the TLS protocol called SNI
(Server Name Indication) that lets the client tell the server the name of the
server it wants to talk to. curl will only extract the SNI name to send from
the given URL.
Provide a custom IP address for a name
Do you know better than the name resolver where curl should go? Then you can
give an IP address to curl yourself! If you want to redirect port 80 access for
example.com to instead reach your localhost:
curl --resolve example.com:80:127.0.0.1 http://example.com/
You can even specify multiple
--resolve switches to provide multiple
redirects of this sort, which can be handy if the URL you work with uses HTTP
redirects or if you just want to have your command line work with multiple
--resolve inserts the address into curl's DNS cache, so it will effectively
make curl believe that's the address it got when it resolved the name.
When talking HTTPS, this will send SNI for the name in the URL and curl will verify the server's response to make sure it serves for the name in the URL.
Provide a replacement name
As a close relative to the
--resolve option, the
provides a minor variation. It allows you to specify a replacement name and
port number for curl to use under the hood when a specific name and port
number is used to connect.
For example, suppose you have a single site called
www.example.com that in turn
is actually served by three different individual HTTP servers: load1, load2
and load3, for load balancing purposes. In a typical normal procedure, curl
resolves the main site and gets to speak to one of the load balanced servers
(as it gets a list back and just picks one of them) and all is well. If you
want to send a test request to one specific server out of the load balanced
load1.example.com for example) you can instruct curl to do that.
You can still use
--resolve to accomplish this if you know the specific IP
address of load1. But without having to first resolve and fix the IP address
separately, you can tell curl:
curl --connect-to www.example.com:80:load1.example.com:80 http://www.example.com
It redirects from a SOURCE NAME + SOURCE PORT to a DESTINATION NAME +
DESTINATION PORT. curl will then resolve the
load1.example.com name and
connect, but in all other ways still assume it is talking to
Name resolve tricks with c-ares
As should be detailed elsewhere in this book, curl may be built with several different name resolving backends. One of those backends is powered by the c-ares library and when curl is built to use c-ares, it gets a few extra superpowers that curl built to use other name resolve backends don't get. Namely, it gains the ability to more specifically instruct what DNS servers to use and how that DNS traffic is using the network.
--dns-servers, you can specify exactly which DNS server curl should use
instead of the default one. This lets you run your own experimental server that
answers differently, or use a backup one if your regular one is unreliable or dead.
--dns-ipv6-addr you ask curl to "bind" its local
end of the DNS communication to a specific IP address and with
--dns-interface you can instruct curl to use a specific network interface to
use for its DNS requests.
--dns-* options are very advanced and are only meant for people who know
what they are doing and understand what these options do. But they offer very
customizable DNS name resolution operations.
curl will typically make a TCP connection to the host as an initial part of its network transfer. This TCP connection can fail or be very slow, if there are shaky network conditions or faulty remote servers.
To reduce the impact on your scripts or other use, you can set the maximum time
in seconds which curl will allow for the connection attempt. With
--connnect-timeout you tell curl the maximum time to allow for connecting,
and if curl has not connected in that time it returns a failure.
The connection timeout only limits the time curl is allowed to spend up until the moment it connects, so once the TCP connection has been established it can take longer time. See the Timeouts section for more on generic curl timeouts.
If you specify a low timeout, you effectively disable curl's ability to connect to remote servers, slow servers or servers you access over unreliable networks.
The connection timeout can be specified as a decimal value for sub-second precision. For example, to allow 2781 milliseconds to connect:
curl --connnect-timeout 2.781 https://example.com/
On machines with multiple network interfaces that are connected to multiple networks, there are situations where you can decide which network interface you would prefer the outgoing network traffic to use. Or which originating IP address (out of the multiple ones you have) to use in the communication.
Tell curl which network interface, which IP address or even host name that you
would like to "bind" your local end of the communication to, with the
curl --interface eth1 https://www.example.com/ curl --interface 192.168.0.2 https://www.example.com/ curl --interface machine2 https://www.example.com/
Local port number
A TCP connection is created between an IP address and a port number in the local end and an IP address and a port number in the remote end. The remote port number can be specified in the URL and usually helps identify which service you are targeting.
The local port number is usually randomly assigned to your TCP connection by the network stack and you normally don't have to think about it much further. However, in some circumstances you find yourself behind network equipment, firewalls or similar setups that put restrictions on what source port numbers that can be allowed to set up the outgoing connections.
For situations like this, you can specify which local ports curl should bind the the connection to. You can specify a single port number to use, or a range of ports. We recommend using a range because ports are scarce resources and the exact one you want may already be in use. If you ask for a local port number (or range) that curl can't obtain for you, it will exit with a failure.
Also, on most operating systems you cannot bind to port numbers below 1024 without having a higher privilege level (root) and we generally advise against running curl as root if you can avoid it.
Ask curl to use a local port number between 4000 and 4200 when getting this HTTPS page:
curl --local-port 4000-4200 https://example.com/
TCP connections can be totally without traffic in either direction when they are not used. A totally idle connection can therefore not be clearly separated from a connection that has gone completely stale because of network or server issues.
At the same time, lots of network equipment such as firewalls or NATs are keeping track of TCP connections these days, so that they can translate addresses, block "wrong" incoming packets, etc. These devices often count completely idle connections as dead after N minutes, where N varies between device to device but at times is as short as 10 minutes or even less.
One way to help avoid a really slow connection (or an idle one) getting treated as dead and wrongly killed, is to make sure TCP keep alive is used. TCP keepalive is a feature in the TCP protocol that makes it send "ping frames" back and forth when it would otherwise be totally idle. It helps idle connections to detect breakage even when no traffic is moving over it, and helps intermediate systems not consider the connection dead.
curl uses TCP keepalive by default for the reasons mentioned here. But there might be times when you want to disable keepalive or you may want to change the interval between the TCP "pings" (curl defaults to 60 seconds). You can switch off keepalive with:
curl --no-keepalive https://example.com/
or change the interval to 5 minutes (300 seconds) with:
curl --keepalive-time 300 https://example.com/