Lab 2.2: Becoming RESTful

Overview

In Android lab 1.3, you performed a GET request to read some JSON or XML data from an external server. Here we create that server (or "the cloud", as marketing people like to call it). The goal of this lab is not dynamic web development; it is about creating a Web API.

In this lab, you will learn:

  • A typical development/testing workflow.
  • Running any binary on your server through CGI.
  • Writing URLs on the fly using Apache mod_rewrite.
  • A simple RESTful server example.

During this lab, we will frequently modify the httpd.conf file. To make life easier, create a symbolic link to it inside the /etc/ directory.

If you get an error saying that the file already exists, then you have already created the link previously.
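As a minimal sketch of the link step, assuming Apache was built from source under /usr/local/apache2 (so the real file is conf/httpd.conf there) and that the link lives at /etc/httpd.conf, the path used later in this lab. The variable names and the link_conf helper are hypothetical.

```shell
# Assumed locations; override the variables if your layout differs.
CONF_SOURCE="${CONF_SOURCE:-/usr/local/apache2/conf/httpd.conf}"
CONF_LINK="${CONF_LINK:-/etc/httpd.conf}"

# link_conf is a hypothetical helper that creates the symbolic link.
# ln -s refuses to replace an existing name and exits with a non-zero status.
link_conf() {
    ln -s "$CONF_SOURCE" "$CONF_LINK"
}

# On the server this needs root, for example:
#   sudo ln -s /usr/local/apache2/conf/httpd.conf /etc/httpd.conf
```

Running the command a second time fails with a message along the lines of "File exists", which simply means the link is already in place.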

 

Optional:

Quickly skim this link, which describes the structure of the httpd.conf file.

Install PHP

PHP is a popular scripting language for the server side. You can also run PHP scripts without Apache. If you want to use PHP in Apache, then you need to add mod_php to your Apache modules (check /usr/local/apache2/modules/ to find available modules).

Follow these steps to install PHP on your server.

  • Log in to your server.
  • Install the required package.
  • Download and install the latest PHP.
  • Now, you should find the PHP shared library (.so) in ‘/usr/local/apache2/modules/’. Try this nice piped command.
  • Create a directory called php in /var/www.
  • Edit “/etc/httpd.conf”:
  • Add the following configuration at the end:
  • Save and close the file (press ‘CTRL+X’, then press ‘y’, then enter).
  • Restart the Apache server.
  • Create a simple test script called index.php inside the /php/ directory:
  • In your browser, check the new link.
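The piped command mentioned in the steps above is not reproduced here. One plausible form, offered only as a sketch, lists the module directory and filters the names for "php". The MODULES_DIR variable and the list_php_modules helper are hypothetical names.

```shell
# Module directory of a source build (the path given in this lab).
MODULES_DIR="${MODULES_DIR:-/usr/local/apache2/modules}"

# List the directory and keep only the names containing "php" (any case).
list_php_modules() {
    ls "$MODULES_DIR" | grep -i php
}

# On the server, the same thing as a one-liner:
#   ls /usr/local/apache2/modules/ | grep -i php
```

If the pipe prints nothing, the PHP shared library was not installed into that directory.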

Workflow

One thing you should know when programming on the server side is that you usually don’t have a nice IDE reporting warnings and errors, and you have no personal secretary around. Therefore, you need a strategy for writing and testing your code and for detecting errors. If you write a web application, you can use a web browser to issue GET requests, simply by calling a URL with variables (e.g., myserver.com/form.php?name=value). But what about POST data? You can’t embed it in the URL. One solution is to create a simple HTML form on your local machine, but HTML forms by default encode variables with a fixed MIME type (application/x-www-form-urlencoded), which is not what we want when sending JSON.

An easy way to deal with this issue is a simple tool called ‘nc’ (NetCat). nc lets you connect to an arbitrary port and send ASCII data to the destination. You can also run it in listening mode. If you are using Linux, open a terminal on your local machine and run:

Open another terminal on your local machine and run:

Now, type something in one terminal and press enter. You will see your message delivered to the other terminal.
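The two commands for this experiment could look like the sketch below. The port number 1234 and the helper names are assumptions; any free unprivileged port works. The helpers are only defined here, since each one blocks the terminal it runs in.

```shell
# Arbitrary unprivileged port (an assumption, pick any free one).
PORT="${PORT:-1234}"

# Terminal 1: wait for a connection on the port.
# Some netcat builds want the port after -p instead: nc -l -p "$PORT"
nc_listen() {
    nc -l "$PORT"
}

# Terminal 2: connect to the listener on the same machine.
nc_connect() {
    nc localhost "$PORT"
}
```

Run nc_listen in the first terminal and nc_connect in the second; CTRL+C ends either side.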

You can use any IP address if you know which server and which port to communicate with. The default is TCP communication, though you can change it to UDP as well. NetCat is suitable only for plain-text protocols, but there are many of those (HTTP, SMTP, IMAP), so it is a general way to connect to a server.

Now, let’s perform a GET request to your hello world page from the last lab:

You will get your hello world HTML code. You will use this technique to send POST, PUT, and DELETE requests to the server with some JSON data attached.
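A hand-written GET request is just a few lines of text, so it can be sketched as below. The SERVER value and the helper names get_request and http_get are assumptions; replace the address with your own server.

```shell
# Assumption: replace with your server address.
SERVER="${SERVER:-localhost}"

# Print a minimal HTTP/1.0 GET request for the path given in $1.
# Each header line ends in \r\n, and an empty line ends the header.
get_request() {
    printf 'GET %s HTTP/1.0\r\nHost: %s\r\n\r\n' "$1" "$SERVER"
}

# Send the request to port 80 and print whatever the server answers.
# This needs network access to the server, so it is only defined here.
http_get() {
    get_request "$1" | nc "$SERVER" 80
}
```

For example, http_get / asks for the front page. Typing the same lines by hand after running nc with the server address and port 80 has the same effect.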

If your server application fails (e.g., with a segmentation fault), Apache returns “HTTP/1.0 500 Server Error”. You will find the logs in ‘/usr/local/apache2/logs/’. Use the tail command to show the end of a log file (-f means follow).
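A small sketch of the log check, assuming the error log is called error_log, the default name in a source build. LOG_DIR, last_errors, and follow_errors are hypothetical names.

```shell
# Log directory given in this lab; the file name error_log is an assumption.
LOG_DIR="${LOG_DIR:-/usr/local/apache2/logs}"

# Print the last N lines of the error log once (default 20).
last_errors() {
    tail -n "${1:-20}" "$LOG_DIR/error_log"
}

# Keep the log open and print new lines as they arrive (stop with CTRL+C).
follow_errors() {
    tail -f "$LOG_DIR/error_log"
}
```

Leaving follow_errors running in a second terminal while you send requests shows each error as it is logged.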

Common Gateway Interface (CGI)

You can think of CGI as a wrapper that calls any executable on your server from the web. Fortunately, the default Apache build configuration (./configure) includes the CGI module (you can find all installed modules in /usr/local/apache2/modules/).

  • Edit the Apache configuration (sudo nano /etc/httpd.conf), and add the following line at the end:

    Save the file and exit (CTRL+X => ‘Y’ => ENTER).
    AddHandler means that files with the listed extensions (such as .sh, .pl, .py) are handled by the CGI module. You may add more, or keep only .cgi and rename your executables to end in .cgi.
  • Create a directory for CGI executables.
  • Restart the httpd server.
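The configuration steps above might look like the following sketch, which only prints a candidate fragment so that you can review it before pasting it into /etc/httpd.conf. The AddHandler line follows the description above. The Directory block is an assumption: script execution has to be permitted for the CGI directory one way or another, and your configuration may already do so. The function name cgi_fragment is hypothetical.

```shell
# Print a candidate configuration fragment for CGI scripts.
# Assumption: the CGI directory is /var/www/cgi, as used later in this lab.
cgi_fragment() {
cat <<'EOF'
AddHandler cgi-script .cgi .sh .pl .py
<Directory "/var/www/cgi">
    Options +ExecCGI
</Directory>
EOF
}

cgi_fragment

# Remaining steps on the server (need root):
#   sudo mkdir -p /var/www/cgi
#   sudo /usr/local/apache2/bin/apachectl stop
#   sudo /usr/local/apache2/bin/apachectl start
```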

Now you are ready to write your first hello world application. Remember, you can use any language you know. Let’s start with a simple Linux Bash script.

  • Create a file:
  • Edit the file as root (sudo nano /var/www/cgi/test.sh).
  • Remember from the pre-lab that Content-type is part of the HTTP header.
  • Make the file executable.
  • In your browser, check the link ( <your server address>/cgi/test.sh ).
  • It should print hello world!
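A minimal sketch of what test.sh could contain is shown below. The HTML body is an assumption; the only fixed parts are the Content-type header line and the blank line that separates the header from the body. The function wrapper hello_cgi is a hypothetical name used to keep the output in one place.

```shell
#!/bin/bash
# Sketch of /var/www/cgi/test.sh: a CGI program writes its response to
# standard output, header first.
hello_cgi() {
    echo "Content-type: text/html"   # HTTP header line
    echo ""                          # blank line: end of header
    echo "<html><body><h1>hello world!</h1></body></html>"
}

hello_cgi

# Make the script executable on the server (needs root):
#   sudo chmod +x /var/www/cgi/test.sh
```

Because the script only writes to standard output, you can also run it directly in a terminal to check what Apache will receive.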

Next Step

In order to build a real web app, you should be able to read the HTTP headers (request type, content type, client IP, etc.) and the contents. Fortunately, CGI provides all the required values as environment variables. To see the list of variables, add to your test.sh:

In your browser, check your application link. You will find something like:

Within these variables, you get everything you need. For example, REQUEST_METHOD tells you whether the client call is GET, POST, PUT, or DELETE, and REMOTE_ADDR is the client IP. Most important is QUERY_STRING, where you get the URL variables.

  • In your browser, type a link <your server>/cgi/test.sh?var1=myname
  • You will see the query inside QUERY_STRING environment variable.
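One way to print the CGI variables is sketched below. The function name env_cgi and the three labelled lines are assumptions; the env command at the end prints every variable the script received. The last line is a local dry run that sets the variables by hand, which lets you try the script without Apache.

```shell
#!/bin/bash
# Sketch of a CGI script that reports what the server passed in.
env_cgi() {
    echo "Content-type: text/plain"
    echo ""
    echo "Method: $REQUEST_METHOD"
    echo "Client: $REMOTE_ADDR"
    echo "Query:  $QUERY_STRING"
    echo "--- all variables ---"
    env
}

# Local dry run: export the variables Apache would normally set.
( export REQUEST_METHOD=GET REMOTE_ADDR=127.0.0.1 QUERY_STRING='var1=myname'; env_cgi )
```

Under Apache the subshell line is not needed; the script would simply call env_cgi.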

Now, let’s try a POST request. On your local computer, use the nc command (if you are using Windows, download a NetCat tool from somewhere online).

  • Execute the following HTTP request (make sure you understand it):
  • Note that there should be an extra empty line between the HTTP header and the content (the system will not read it as a blank line if you copy and paste the code directly). The server will return an HTTP response similar to:
  • The content type is application/x-www-form-urlencoded, which is a standard encoding usually used in HTML forms and for URL variables in GET requests. However, you can’t read the content from QUERY_STRING, which is empty for this request.
  • On the server side, you can read the POST contents from standard input (this is how CGI provides data). Add to test.sh:
  • Learn about the dd command (in your terminal type: man dd); it’s quite handy in Linux. Here it reads one byte at a time (bs=1), CONTENT_LENGTH times. ‘2>’ redirects error messages, in this case to the null device. The “`” symbol means execute the command; its output is stored in the variable.
  • Try to send some JSON data:
  • Again, note that there should be an extra empty line between the HTTP header and the content (the system will not read it as a blank line if you copy and paste the code directly).
    You should see the JSON data read and sent back by the server. If you want to send different JSON data, make sure the content length matches the number of characters in the content section.
  • Check the content type variable. Make sure the response is “application/json”.
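The POST steps above can be sketched in two parts, under stated assumptions. The server-side functions read exactly CONTENT_LENGTH bytes from standard input with dd and echo them back with the content type the client sent. The client-side function builds the request text and computes Content-Length from the body, which avoids counting characters by hand. All function names, the SERVER value, and the sample JSON are hypothetical.

```shell
#!/bin/bash
# --- server side (would go into test.sh) ---

# Read CONTENT_LENGTH bytes from standard input, one byte per block.
# 2>/dev/null hides the statistics that dd prints on standard error.
read_body() {
    dd bs=1 count="${CONTENT_LENGTH:-0}" 2>/dev/null
}

# Echo the request body back, using the content type the client sent.
echo_cgi() {
    BODY=$(read_body)    # $(...) is equivalent to the backtick form
    echo "Content-type: ${CONTENT_TYPE:-text/plain}"
    echo ""
    echo "$BODY"
}

# --- client side (your local machine) ---
SERVER="${SERVER:-localhost}"   # assumption: replace with your server

# $1 = path, $2 = content type, $3 = body.
# ${#3} is the length of the body in characters; for plain ASCII this
# equals the byte count that Content-Length must state.
post_request() {
    printf 'POST %s HTTP/1.0\r\nContent-Type: %s\r\nContent-Length: %s\r\n\r\n%s' \
        "$1" "$2" "${#3}" "$3"
}

# Send it (needs network access to the server):
#   post_request /cgi/test.sh application/json '{"name":"X"}' | nc "$SERVER" 80
```

The \r\n\r\n in the printf format is the empty line between header and content that the steps above insist on.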

Writing URLs on the fly

By default, Apache maps URLs to directories and files. We will use Apache mod_rewrite to override this behavior and redirect any URL a client provides to some server executable. This module is already compiled, so all you need to do is enable it and configure your Apache.

From here on we give PHP examples. You may write similar code in your favorite language, since all you need is to read environment variables, read from standard input, and parse JSON (encoding and decoding). We will use the Apache mod_php module to run PHP scripts, as described in the pre-lab. The environment variables are similar to those of CGI, plus some extra PHP variables. Check the PHP site for the full list.

GET requests

Let’s configure a simple application. We want to respond to any address of the form “[server]/greeting/<some_name>” by calling “/php/get.php?name=some_name”.

Create a folder ‘php’ and a file ‘get.php’ inside it.

Add the following lines to get.php, then save and close file:

  • Edit the Apache config file (sudo nano /etc/httpd.conf), and add the following at the end:
  • Also add the following, save the changes, and restart apache2:
  • The RewriteRule directive is the core of the whole thing. The first argument is a regex that is matched against the URL given by the client. The second argument is the application to execute on the server. Note that the parenthesized group in the Perl-compatible regex is available as the variable $1, which is passed to our script. In your browser, type <server dns>/greeting/<your name>, and you should get a nice greeting in response.
  • You can also block requests to certain directories, say the /test/ directory. You can return the forbidden HTTP code (403), or perhaps page not found (404). Add the following rule to the above code (i.e., inside IfModule mod_rewrite.c):
  • Flags are given in square brackets; ‘F’ means forbidden. You can find the complete flag list in the Apache documentation.
  • Note that rewrite rules are executed in order from top to bottom. If you don’t want other rules to be executed after some rule, you may add the last flag ‘L’.
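The rules described in the list above might look like the fragment printed by this sketch. It is a guess at the configuration rather than the original listing: the exact regex, the commented-out LoadModule line, and the function name rewrite_fragment are assumptions. The script only prints the fragment for review; paste it into /etc/httpd.conf yourself.

```shell
# Print a candidate mod_rewrite fragment for the GET example.
# The quoted 'EOF' keeps the shell from expanding $1 inside the text.
rewrite_fragment() {
cat <<'EOF'
# Only needed if mod_rewrite is a shared module that is not loaded yet:
# LoadModule rewrite_module modules/mod_rewrite.so
<IfModule mod_rewrite.c>
    RewriteEngine On
    # Refuse anything under /test/ with 403; "-" means no substitution.
    RewriteRule ^/test/ - [F]
    # /greeting/<name> is served by /php/get.php?name=<name>.
    # L stops rule processing once this rule has matched.
    RewriteRule ^/greeting/([^/]+)$ /php/get.php?name=$1 [L]
</IfModule>
EOF
}

rewrite_fragment
```

The group ([^/]+) captures everything after /greeting/ up to the next slash, and that capture is what $1 refers to in the second argument.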

Now, you can present any directory structure to the user. This reflects an important REST principle: resources are addressed through a directory-like URL structure. Next come the POST/PUT/DELETE requests. The question is, how can we pass variables to our script? We know that we can’t embed variables in the URL in a POST request.

POST/PUT/DELETE requests

These types of requests carry their own contents, not in the URL. One solution would be to ask mod_rewrite to add the matched parts of the URL to the content; however, this is difficult to achieve because the content could be of any MIME type, not only JSON or XML. A simple trick to solve this issue is to store the matched parts in environment variables.

  • We use POST to create something on the server. Create a simple script, say post.php. Assume we already have a resource named “X” stored on the server. The script returns HTTP status 201 (Created) if the name is not “X”; otherwise it returns status 409 (Conflict):
  • Open httpd.conf and change the previously added code (i.e., IfModule mod_rewrite.c) as follows:

RewriteCond adds a condition to the RewriteRule that follows it. For example, a regex on the URL cannot tell whether the request is a POST or a GET. RewriteCond tests a server variable (the same variables we have seen in CGI) against a pattern. In our case, we test REQUEST_METHOD: if it matches “^GET” (i.e., starts with the character sequence GET), the RewriteRule goes ahead with its action. Note that the RewriteRule pattern is evaluated before its RewriteCond lines, so the URL format is checked first. We can place more than one RewriteCond above any RewriteRule.

In the second RewriteRule, note that we have added a new flag, ‘E’. The expression means: create a new environment variable called “name” and give it the value $1, which is the first parenthesized group matched by the regex.
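A sketch of how the changed block could look, under the same assumptions as before (the regex and the function name method_fragment are hypothetical). Each RewriteCond applies only to the RewriteRule directly below it, and the E flag takes the form E=variable:value.

```shell
# Print a candidate fragment that picks a script by request method.
method_fragment() {
cat <<'EOF'
<IfModule mod_rewrite.c>
    RewriteEngine On
    # GET: pass the name in the query string.
    RewriteCond %{REQUEST_METHOD} ^GET
    RewriteRule ^/greeting/([^/]+)$ /php/get.php?name=$1 [L]
    # POST: hand the name to the script as environment variable "name".
    RewriteCond %{REQUEST_METHOD} ^POST
    RewriteRule ^/greeting/([^/]+)$ /php/post.php [E=name:$1,L]
</IfModule>
EOF
}

method_fragment
```

PUT and DELETE would each get another RewriteCond and RewriteRule pair of the same shape, pointing at their own scripts.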

  • Any time you modify /etc/httpd.conf, you need to stop and start the Apache server (note: restart is sometimes buggy).
  • In your client terminal, send a POST request:
  • You should get 409 (Conflict). If you are unsure which status to return in your code, check this link.
  • Try another POST with a different name, say:
  • You will get 201 (Created) status.

From these few configurations, you have endless possibilities for how to configure your site. DELETE and PUT requests can be configured exactly as we discussed; the only difference is the REQUEST_METHOD value.

Notes

CGI is quite handy because you can use any programming language; however, its performance is poor. The reason is that whenever a client requests some server code, Apache creates a new CGI process. With thousands of calls to the server you get thousands of processes, which adds a huge overhead. The solution is either to use an Apache module for a particular language (such as mod_c, mod_python, mod_php, etc.) or to use FastCGI. FastCGI runs in a process independent of Apache: whenever a client requests some executable, Apache forwards the request to the FastCGI daemon, which serves it from a long-running application process instead of starting a new process for every request. Making your code compatible with FastCGI typically takes only around two lines of code. FastCGI also lets you run your server applications on a different machine from Apache, which is a nice way to balance the load of your Apache requests across multiple machines.

Exercise 2.2

Create a simple Web API through which we can create, retrieve, modify, and delete student profiles. Each profile has a unique URL specified by the student ID. The URL format is “localhost/profiles/<student ID>” (e.g., /profiles/mkhonji). The content should be JSON key-value pairs such as first name, second name, age, and GPA. You should do the following:

  • When a client makes a POST request: create a file <student ID>.json and store the content of the POST request plus the “ID”: “<student id>” field taken from the URL. If the resource is already there (i.e., the file exists), return status 409 (Conflict).
  • When a client makes a GET request: return the content of the corresponding JSON file, or if the file is not there, return status 404 (Not Found).
  • When a client makes a PUT request: replace the content of the corresponding file with the request payload plus the ID field, as in the POST request. If the file doesn’t exist, return status 204 (No Content).
  • When a client makes a DELETE request: delete the corresponding file, or if the file is not there, return status 204 (No Content).

Hint

If you are going to program in PHP, you can use the json_encode and json_decode functions to handle JSON data. Here is a simple example of how to read and modify some JSON data. We add a new field “name” and give it some value, then we merge it with the previous content.


Reading files is easy in PHP. Check this site. For writing, you may use this:

 

Submission guidelines

  • The exercise should be done individually.
  • Copy the final exercise files (i.e., the php scripts and httpd.conf) into “lab-2.2” directory.
  • Add, commit, and push your changes to the remote server.
  • The deadline is before the next lab.

Reference