How can I make HTTP requests to web APIs

What is a web API?

A web API is a server that listens for an HTTP request, and is used largely to manage online data. There are many reasons why web APIs exist, but one of the easiest to see is the fact that most companies that are largely based on the web (i.e.: Facebook, Google, Twitter, etc) need both a web app (for using in your favorite internet browser) and a mobile app (largely to run on phones and tablets). As a result, many of these companies have built their web apps so that they both can call a single web API that manages their data. This allows engineers to ensure that all the data is kept consistent between the two applications and so that application programmers working just on the web app or the phone app can focus on their individual applications and call the web API as they need to interact with the data they both share.

Web API architecture

Basic Components of a web request to an API

Web APIs typically use HTTP to facilitate web requests. This makes it feel more natural calling them as part of loading up web resources. It essentially works like making any other HTTP request to a website to load a page. The difference here is that instead of getting a page, you’re managing data.

URL

A URL in HTTP is just like a URL you use to navigate to a website. It consists of the scheme (http or https), the IP address and port, and the path to the page or resource your accessing.

http://127.0.0.1:8080/aboutUs

The scheme is http. The IP address (in this case the loopback address for a website running on my personal computer) identifies the server (a computer that is listening for incoming requests and sends back responses to valid requests) to send the request to. The port specified is port 8080 (this is like a specific door on your computer you let network traffic in on). The last piece is the logical path on the server to get a page or resource. In this case, it is the aboutUs page on the website.

Scheme

For this article, I am not planning on getting too deep into schemes since for the vast majority of http requests I deal with in sending from a program is either http or https. In short, http is sent via plain text and https is encrypted. For all the libraries and tools I’ve used, any differences in handling http vs https are all handled under the covers, so the difference is really only based on whether a server you’re sending a request to allows http requests or requires https requests.

IP Address and DNS Resolution

This is another section that can have a whole book written on it. All that you really need to know is that IP addresses are used to uniquely identify a server to send a request to. Because most of us would rather use names like “google.com” or “facebook.com,” rather than a series of random numbers, DNS servers were made to translate these easy-to-remember names to IP addresses that computers can understand. All the http libraries I’ve ever used utilize this same functionality, so you can use either an explicit IP address or a name that gets resolved by a DNS.

Port

Without getting too deep into this, a port can be thought of as a door on a server that allows traffic in. That means, if you send a request to google.com, HTTP needs to specify a port it is sending the request to. Without the port, it’s like walking to your friend’s apartment building without knowing which apartment they’re in. Without knowing the correct door to knock on, you can’t contact your friend.

There are default ports for different kinds of traffic. For http, the default port is 80. For https, the default port is 443. Because this is standardized, any libraries you use to send requests will use these default ports for the scheme you’re using unless you specify otherwise. In the example above, I had my server listening on port 8080, so the http request I sent had to explicitly specify that port. The vast majority of the time, web APIs will be listening on their default ports. There is rarely a reason to be listening on any port besides the default.

Route

The route is largely defined by the person who set up the server or the website. When on a website, the route largely follows the same concept as a file path. The root path would be the home page (starting at “/”). Then any of the main pages you can get to from there (such as aboutUs) in the above example would have that added on so that the path is “/aboutUs.” A website doesn’t need to be organized in this way necessarily, but that has been a common pattern.

For web APIs, these are usually arranged to represent a collection of things narrowed down by increasingly specific features. While this can vary, the pattern is pretty standard. For example, if you had a web API that returned a list of customers, you may have something like the following:

/customers/{id}

The “customers” piece designates that you’re looking at a collection of customers, and “{id}” indicates that you’re passing in an id representing a specific customer, so your request might look like the following:

/customers/325

Another endpoint on the web API might provide products, and the same kind of pattern would be used to get back a specific product:

/products/22

Data

While there are other data formats, the most commonly used on the web, largely because of its simplicity is JSON. When making calls to web APIs plan on passing and receiving data as JSON. In our customer example, a JSON object for a customer could be something like:

{
    "id": 5,
    "firstName": "Bob",
    "lastName": "Smith",
    "phone": "123-555-0125",
    "address": "123 Anywhere St"
}

This format is used both when data is received from a web API and when sending data to the web API.

Methods

HTTP uses different methods to give the server information on the specific nature of the request. Typically, a collection (in our example customers) will have HTTP methods corresponding to the typical CRUD operations that commonly performed on data.

GET

The GET request is used to read data from a web API. Using the customer endpoint mentioned above, a GET request to retrieve a customer with a given id is going to be something like the following (with the exception of any other headers that may be included as part of this request):

GET /customers/325 HTTP/1.1
Host: 127.0.0.1
...

No body is passed in. The request here uses just the route (/customers/325), and the server uses that to find and return the customer that has an id of 325.

{
    "id": 325,
    "firstName": "Bob",
    "lastName": "Smith",
    "phone": "123-555-0125",
    "address": "123 Anywhere St"
}

PUT

The PUT request is used to update a resource. Using our example from above, it may be that our customer needs his name updated, so we would do a PUT request similar to the following (excluding the other headers that may be present for the request):

PUT /customers/325 HTTP/1.1
Host: www.tutorialspoint.com
...

{
    "id": 325,
    "firstName": "Bobby", // Updated to "Bobby"
    "lastName": "Smith",
    "phone": "123-555-0125",
    "address": "123 Anywhere St"
}

It’s not necessarily required, but often a PUT request will return back the updated resource, and if this web API endpoint did the same, we would get back the following body data:

{
    "id": 325,
    "firstName": "Bobby",
    "lastName": "Smith",
    "phone": "123-555-0125",
    "address": "123 Anywhere St"
}

POST

A post request is used to create a new resource. If we wanted to create a new customer, we would send a POST request to the web API:

POST /customers/325 HTTP/1.1
Host: www.tutorialspoint.com
...

{
    "firstName": "John",
    "lastName": "Smith",
    "phone": "123-555-0138",
    "address": "343 Anywhere St"
}

You may have noticed that in the body, I’m not passing in an id for the new customer, whereas in a PUT request we did. This is because the id is often generated by the web API and handed back to us when the new customer is created. The common thing that you’ll see is when the request completes successfully, you’ll get a response with the info on the new customer as follows:

{
    "id": 44,
    "firstName": "John",
    "lastName": "Smith",
    "phone": "123-555-0138",
    "address": "343 Anywhere St"
}

This is often useful because we may have no other way to identify a customer other than a unique id. For example, we might have 100 customers named “John Smith,” and while it’s possible to use all pieces of data to uniquely identify a customer, it’s not always a guarantee that it will return on unique customer. Beyond that, a big thing about making requests is that their costly in compute time, so if you can get the id as part of a single request rather than making another request to search for a customer, you’re helping keep your program from running slow.

DELETE

As the name implies, this is for deleting a resource. In our example, we have a list of customers, and maybe that customer moved, and so we need to remove them from the database. A DELETE request will look similar to GET:

DELETE /customers/325 HTTP/1.1
Host: www.tutorialspoint.com
...

This will identify the customer to be deleted, and it may or may not return the info on the deleted customer (depending on who has set up the web API).

HTTP Status Codes

Up to this point, I have talked about requests that don’t result in any errors from the server. The reality is that there are plenty of times that you’ll get an unsuccessful response from a web API. Status codes are how we can determine whether a request completed successfully or if it failed, and if so, what might have gone wrong.

Status codes come in ranges, and these ranges indicate the general reason for the failure. I will go through the ranges and specific status codes I have seen by and large the most. There are others out there that are much less used, and a Google search can be used to get specific info on each of them.

2xx Status Code Range

Status codes starting at 200 (and in the 200s) indicate a success response. The two specific status codes I have seen are 200 and 204.

200 OK

This status code is by far the most common one I have seen when making HTTP requests. It will often be associated with content from a response. For our GET example, if everything went well on the server’s end, then we would get something similar to the following response:

HTTP/1.1 200 OK
...

{
    "id": 325,
    "firstName": "Bob",
    "lastName": "Smith",
    "phone": "123-555-0125",
    "address": "123 Anywhere St"
}

While this somewhat rehashes the fact that the GET endpoint returns, I specifically wanted to mention the status code to show what that would look like.

204 No Content

Some requests will not return any data after they complete. For example, the PUT request mentioned above may not return anything because it is just assumed that if you get a 2XX response, it worked, and you can just move on. I have not seen these near as much as 200, but it still shows up occasionally.

HTTP/1.1 204 No Content
...

4XX Status Code Range

4XX status codes indicate that there was something wrong with the request itself. Specifically, the client making the request could be doing anything such as requesting data that isn’t there, sending unrecognized or invalid information, is not authorized etc. There are a whole myriad of reasons why a 4XX (often just called a “four hundred error”) error would occur. The web API should respond with a reason that the client can read and address appropriately. The most common ones I’ve seen are 400, 401, 403, and 404.

400 Bad Request

The web API will look at the request that it is getting and validate the data to make sure it makes sense before attempting to actually fulfill the request. This prevents the web API from throwing exceptions or crashing (or worse, executing malicious code). Using our customer example, the following request would likely result in a 400 Bad Request response since it is missing a phone number.

PUT /customers/325 HTTP/1.1
Host: www.tutorialspoint.com
...

{
    "id": 325,
    "firstName": "Bobby"
    "lastName": "Smith",
    // missing phone number
    "address": "123 Anywhere St"
}

401 Unauthorized

When a user is not authorized to access a given endpoint, then this status code is returned.

I didn’t go into authorization in this post because to properly address it an entire post should be dedicated to it. In short though, when logging into a web app (like Facebook), you’ll send a request with a username and password, and the website will send back an authorization token (which is going to look like a long string of random letters and numbers). This string is passed as a header in your HTTP requests to let the web API know who you are so it can know that you’re who you’re logged in as (authentication) and that you can access the resource you’re looking for (authorization). A request with an authorization token will look something like this:

GET /customers/325 HTTP/1.1
Host: 127.0.0.1
Authorization: 23JK2L33HJ23HK2HK23J
...

In short, if you are trying to get your friends list from the Facebook API, you will need to provide a valid authorization token in the header of your request or you will get a 401. Additionally, if you try to access someone else’s friend list with your access token, you’ll get a 401.

403 Forbidden

I don’t see this one very often, but this occurs when the web API refuses to fulfill the request regardless of whether or not you’re authorized. This often may because application authorization on the server may prevent it from doing certain things. Also, some web APIs will mistakenly return this status code when a user is not authorized. So while a web API shouldn’t return this status code because you’re unauthorized, it still may be worth checking to ensure that you’re including the correct authorization header to make sure you’re not getting this because the people who created this web API set it up to return this status code instead of 401.

404 Not Found

This occurs literally when the resource is not found. In our case for the customer endpoint we have the following request:

GET /customers/325 HTTP/1.1
Host: 127.0.0.1
...

If a customer with an id of 325 does not exist, then a 404 will be returned.

In general, 4XX errors are something that you can anticipate and recover from programmatically or which you can debug to ensure you’re setting up your requests correctly.

5XX Status Code Range

This range of status codes indicates that an error occurred on the web API server. The only 5XX status code I have seen is just 500.

500 Internal Server Error

On a well-designed and well-tested web API server, this literally means that there was an internal server error that prevents it from completing the request. Typically this results from an uncaught exception that occurred during the execution of your request. Without direct access to the server code, it’s highly unlikely you’ll be able to pin down the cause. It could be from a range of any number of things, and it’s possible that it maybe temporary. It’s possible that a minute later, the same request works.

I have typically seen 500 as a catch-all error. What I mean by that is that the web API should do some validation of the incoming request and account for the majority of the errors that might occur and respond with a 4XX error indicating why the failure occurred. 500 accounts for all other errors that could occur (and the number of different possibilities is huge, though for each of these to occur should be pretty rare).

Aside from what the ideal reason for why a 500 should be returned, I have seen them occur in the real world when a web API endpoint does not properly validate an incoming request, and rather than returning a 4XX, it attempts to complete a request with bad data that results in an uncaught exception. Some of these are just corner cases that the engineers of the web API hadn’t considered. Sometimes it’s because there’s a large push for a deadline, so the amount of testing and design required for quality didn’t have sufficient time to ensure everything is properly validated. (I’ve seen this happen in large, well-established companies plenty of times). The point being that even if you get a 500 error, it is often worth it to double-check your request to ensure that it looks good. And of course, if you have access to the server, look at where the exception in the code is occurring to ensure that it is not a corner case you might have missed.

What should I do now?

This was a really brief overview of HTTP requests as they relate to web APIs. I would recommend learning how to use Postman to practice sending and receiving requests. There are plenty of web APIs out there that allow public access. You can check out Google’s, Zillow’s, or Facebook’s web APIs. These sites will require you to set up an account with them and create some kind of access token but can be a great way to learn about web APIs on well known websites. If you want to avoid all the extra setup, go check out Free Public APIs for Developers for public APIs you can practice on. You can use Postman as an easy way to create requests and get responses. Knowing how to do this via Postman gives you an easy way to call them from a program. I’ll be writing additional posts in the future on how to use Postman as well as how to write programs to call these APIs. Until then, have fun building and exploring.

Further Reading

Leave a Reply

Your email address will not be published. Required fields are marked *