Tutorial 8 - Sockets and epoll #
In this tutorial we use the epoll family of functions to wait on multiple descriptors. epoll is not a part of POSIX but a Linux extension. If you want to write portable code, you should look at the select or poll functions, which are standardized but perform worse and are less convenient to use.
Introduction notes:
- It is worth familiarizing yourself with netcat before the lab
- All materials from OPS1 and OPS2 are still obligatory, especially the tutorials on threads and processes!
- A quick look at this material will not suffice. You should compile and run all the programs, check how they work and read additional materials such as man pages. As you read the material, please do all the exercises and answer all the questions. At the end you will find a sample task similar to the one you will do during the labs; please do it at home.
- Code, information and tasks are organized in a logical sequence; to fully understand them you should follow this sequence. Sometimes an earlier task provides context for the next one, which is harder to comprehend without studying the previous parts.
- Most of the exercises require the command line. I usually assume that all the files are placed in the current working directory and that we do not need to add path parts to file names.
- Quite often you will find a $ sign placed before commands you should run in the shell. You do not need to type this sign on the command line; it is there to remind you that what follows is a command to execute.
- What you learn and practice in this tutorial will be required for the next ones. If you have a problem with this material after the graded lab you can still ask the teachers for help.
- In many of the programs below the post-interruption restarting macro TEMP_FAILURE_RETRY is used even though not all of them handle signals. This is because students will reuse most of the presented code in their own solutions, which may require signal handling, and because many of the presented functions can be collected into a library for your future programs, so they should be as portable as possible.
- It is essential to consider architectural differences when planning the protocol used for communication over the network. Please always consider the following:
- byte order - the order in which the bytes of an integer are stored in memory. When a number is sent between machines with different byte orders without a proper conversion, its value changes (e.g. 0x00FF becomes 0xFF00). Fortunately you do not need to know your local byte order; instead you always convert all integers to the so-called network byte order with the macros htons (16 bit shorts) and htonl (32 bit integers) before sending them, and convert them back to host order with the macros ntohs and ntohl just after receiving them.
- There is no network byte order (or a format) for floating point numbers. As their format varies on different architectures you must send them as text or as fixed precision numbers (integers).
- The one byte character string is not affected by the byte order, in most cases sending the data in textual (human readable) format is a good and safe choice. All one byte data types are byte order safe.
- Integer types in the C language do not have a standardized size. They can have a different size depending on the architecture or even the compiler. To avoid problems it is safer to use fixed-size types like int32_t or uint16_t.
- Sending a structure over the network can cause problems too. It is related to the way compilers lay out the structure fields in memory. On some architectures it is faster to read memory at addresses divisible by 8, on others by 16, 32 or another divisor. The compiler tries to speed up access by placing all the fields at fast addresses, which means gaps of unknown size between the members. If those gaps differ between the two sides, then after a binary transfer the structure will not be usable. To avoid this, you would have to tell the compiler to “pack” the structure, i.e. remove the gaps, but this can not be achieved in a portable way within the code. To avoid the problem you can send the structure member by member.
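The points above (fixed-size types, network byte order, member-by-member transfer) can be combined in a small sketch. The struct calc_msg and the helpers pack_msg and unpack_msg are hypothetical names that do not appear in the tutorial sources; the sketch only assumes that every member travels as one 32 bit integer.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical message; the compiler may lay it out differently on each platform. */
struct calc_msg
{
    int32_t op1;
    int32_t op2;
    char operation; /* padding usually follows this member */
    int32_t status;
};

/* Size of the message on the wire: four 32 bit integers, no gaps. */
#define WIRE_SIZE (4 * sizeof(uint32_t))

/* Copy each member separately, as a 32 bit integer in network byte order. */
void pack_msg(const struct calc_msg *m, unsigned char out[WIRE_SIZE])
{
    uint32_t fields[4] = {htonl((uint32_t)m->op1), htonl((uint32_t)m->op2),
                          htonl((uint32_t)(unsigned char)m->operation), htonl((uint32_t)m->status)};
    memcpy(out, fields, WIRE_SIZE);
}

/* Reverse of pack_msg: read four integers and convert them to host order. */
void unpack_msg(const unsigned char in[WIRE_SIZE], struct calc_msg *m)
{
    uint32_t fields[4];
    memcpy(fields, in, WIRE_SIZE);
    m->op1 = (int32_t)ntohl(fields[0]);
    m->op2 = (int32_t)ntohl(fields[1]);
    m->operation = (char)ntohl(fields[2]);
    m->status = (int32_t)ntohl(fields[3]);
}
```

The wire buffer has the same content regardless of the sender's byte order or structure padding, so it can be passed directly to a write on the socket.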
Task on local + TCP sockets #
Write a simple integer calculator server. Data sent to the server consists of:
- operand 1
- operand 2
- result
- operator (+,-,*,/)
- status
all converted to 32 bit integers in an array.
The server calculates the result of the operation (+,-,*,/) on the operands and sends the result back to the client. If the operation is possible the returned status is 1, otherwise it should be 0. The server must work with 2 types of connection:
- local stream sockets
- network tcp sockets
Server is single process application, it takes 2 parameters:
- local socket file name
- port
Write 2 types of client, one for each connection type, those clients shall take the following parameters:
- address of the host (file name for local connection, domain name for inet)
- port number (tcp client only)
- operand 1
- operand 2
- operator (+,-,*,/)
On success the client displays the result on the screen. All the above programs can be interrupted with C-c; the server must NOT leave the local socket file behind in such a case.
Solution #
What you must know:
man 7 socket
man 7 epoll
man 7 unix
man 7 tcp
man 3p socket
man 3p bind
man 3p listen
man 3p connect
man 3p accept
man 2 epoll_create
man 2 epoll_ctl
man 2 epoll_wait
man 3p freeaddrinfo (both functions, getaddrinfo as well)
man 3p gai_strerror
Pay close attention to the Q&A section in man 7 epoll. It is well prepared so we will not repeat it here.
Common library for all sources in this tutorial (l8_common.h):
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <netdb.h>
#include <netinet/in.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>
#ifndef TEMP_FAILURE_RETRY
#define TEMP_FAILURE_RETRY(expression) \
(__extension__({ \
long int __result; \
do \
__result = (long int)(expression); \
while (__result == -1L && errno == EINTR); \
__result; \
}))
#endif
#define ERR(source) (perror(source), fprintf(stderr, "%s:%d\n", __FILE__, __LINE__), exit(EXIT_FAILURE))
int sethandler(void (*f)(int), int sigNo)
{
struct sigaction act;
memset(&act, 0, sizeof(struct sigaction));
act.sa_handler = f;
if (-1 == sigaction(sigNo, &act, NULL))
return -1;
return 0;
}
int make_local_socket(char *name, struct sockaddr_un *addr)
{
int socketfd;
if ((socketfd = socket(PF_UNIX, SOCK_STREAM, 0)) < 0)
ERR("socket");
memset(addr, 0, sizeof(struct sockaddr_un));
addr->sun_family = AF_UNIX;
strncpy(addr->sun_path, name, sizeof(addr->sun_path) - 1);
return socketfd;
}
int connect_local_socket(char *name)
{
struct sockaddr_un addr;
int socketfd;
socketfd = make_local_socket(name, &addr);
if (connect(socketfd, (struct sockaddr *)&addr, SUN_LEN(&addr)) < 0)
{
ERR("connect");
}
return socketfd;
}
int bind_local_socket(char *name, int backlog_size)
{
struct sockaddr_un addr;
int socketfd;
if (unlink(name) < 0 && errno != ENOENT)
ERR("unlink");
socketfd = make_local_socket(name, &addr);
if (bind(socketfd, (struct sockaddr *)&addr, SUN_LEN(&addr)) < 0)
ERR("bind");
if (listen(socketfd, backlog_size) < 0)
ERR("listen");
return socketfd;
}
int make_tcp_socket(void)
{
int sock;
sock = socket(PF_INET, SOCK_STREAM, 0);
if (sock < 0)
ERR("socket");
return sock;
}
struct sockaddr_in make_address(char *address, char *port)
{
int ret;
struct sockaddr_in addr;
struct addrinfo *result;
struct addrinfo hints = {};
hints.ai_family = AF_INET;
if ((ret = getaddrinfo(address, port, &hints, &result)))
{
fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(ret));
exit(EXIT_FAILURE);
}
addr = *(struct sockaddr_in *)(result->ai_addr);
freeaddrinfo(result);
return addr;
}
int connect_tcp_socket(char *name, char *port)
{
struct sockaddr_in addr;
int socketfd;
socketfd = make_tcp_socket();
addr = make_address(name, port);
if (connect(socketfd, (struct sockaddr *)&addr, sizeof(struct sockaddr_in)) < 0)
{
ERR("connect");
}
return socketfd;
}
int bind_tcp_socket(uint16_t port, int backlog_size)
{
struct sockaddr_in addr;
int socketfd, t = 1;
socketfd = make_tcp_socket();
memset(&addr, 0, sizeof(struct sockaddr_in));
addr.sin_family = AF_INET;
addr.sin_port = htons(port);
addr.sin_addr.s_addr = htonl(INADDR_ANY);
if (setsockopt(socketfd, SOL_SOCKET, SO_REUSEADDR, &t, sizeof(t)))
ERR("setsockopt");
if (bind(socketfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
ERR("bind");
if (listen(socketfd, backlog_size) < 0)
ERR("listen");
return socketfd;
}
int add_new_client(int sfd)
{
int nfd;
if ((nfd = TEMP_FAILURE_RETRY(accept(sfd, NULL, NULL))) < 0)
{
if (EAGAIN == errno || EWOULDBLOCK == errno)
return -1;
ERR("accept");
}
return nfd;
}
ssize_t bulk_read(int fd, char *buf, size_t count)
{
int c;
size_t len = 0;
do
{
c = TEMP_FAILURE_RETRY(read(fd, buf, count));
if (c < 0)
return c;
if (0 == c)
return len;
buf += c;
len += c;
count -= c;
} while (count > 0);
return len;
}
ssize_t bulk_write(int fd, char *buf, size_t count)
{
int c;
size_t len = 0;
do
{
c = TEMP_FAILURE_RETRY(write(fd, buf, count));
if (c < 0)
return c;
buf += c;
len += c;
count -= c;
} while (count > 0);
return len;
}
server l8-1_server.c:
#include "l8_common.h"
#define BACKLOG 3
#define MAX_EVENTS 16
volatile sig_atomic_t do_work = 1;
void sigint_handler(int sig) { do_work = 0; }
void usage(char *name) { fprintf(stderr, "USAGE: %s socket port\n", name); }
void calculate(int32_t data[5])
{
int32_t op1, op2, result = -1, status = 1;
op1 = ntohl(data[0]);
op2 = ntohl(data[1]);
switch ((char)ntohl(data[3]))
{
case '+':
result = op1 + op2;
break;
case '-':
result = op1 - op2;
break;
case '*':
result = op1 * op2;
break;
case '/':
if (!op2)
status = 0;
else
result = op1 / op2;
break;
default:
status = 0;
}
data[4] = htonl(status);
data[2] = htonl(result);
}
void doServer(int local_listen_socket, int tcp_listen_socket)
{
int epoll_descriptor;
if ((epoll_descriptor = epoll_create1(0)) < 0)
{
ERR("epoll_create:");
}
struct epoll_event event, events[MAX_EVENTS];
event.events = EPOLLIN;
event.data.fd = local_listen_socket;
if (epoll_ctl(epoll_descriptor, EPOLL_CTL_ADD, local_listen_socket, &event) == -1)
{
perror("epoll_ctl: listen_sock");
exit(EXIT_FAILURE);
}
event.data.fd = tcp_listen_socket;
if (epoll_ctl(epoll_descriptor, EPOLL_CTL_ADD, tcp_listen_socket, &event) == -1)
{
perror("epoll_ctl: listen_sock");
exit(EXIT_FAILURE);
}
int nfds;
int32_t data[5];
ssize_t size;
sigset_t mask, oldmask;
sigemptyset(&mask);
sigaddset(&mask, SIGINT);
sigprocmask(SIG_BLOCK, &mask, &oldmask);
while (do_work)
{
if ((nfds = epoll_pwait(epoll_descriptor, events, MAX_EVENTS, -1, &oldmask)) > 0)
{
for (int n = 0; n < nfds; n++)
{
int client_socket = add_new_client(events[n].data.fd);
if (client_socket < 0)
continue; /* the client vanished between epoll_pwait and accept */
if ((size = bulk_read(client_socket, (char *)data, sizeof(int32_t[5]))) < 0)
ERR("read:");
if (size == (int)sizeof(int32_t[5]))
{
calculate(data);
if (bulk_write(client_socket, (char *)data, sizeof(int32_t[5])) < 0 && errno != EPIPE)
ERR("write:");
}
if (TEMP_FAILURE_RETRY(close(client_socket)) < 0)
ERR("close");
}
}
else
{
if (errno == EINTR)
continue;
ERR("epoll_pwait");
}
}
if (TEMP_FAILURE_RETRY(close(epoll_descriptor)) < 0)
ERR("close");
sigprocmask(SIG_UNBLOCK, &mask, NULL);
}
int main(int argc, char **argv)
{
int local_listen_socket, tcp_listen_socket;
int new_flags;
if (argc != 3)
{
usage(argv[0]);
return EXIT_FAILURE;
}
if (sethandler(SIG_IGN, SIGPIPE))
ERR("Setting SIGPIPE:");
if (sethandler(sigint_handler, SIGINT))
ERR("Setting SIGINT:");
local_listen_socket = bind_local_socket(argv[1], BACKLOG);
new_flags = fcntl(local_listen_socket, F_GETFL) | O_NONBLOCK;
fcntl(local_listen_socket, F_SETFL, new_flags);
tcp_listen_socket = bind_tcp_socket(atoi(argv[2]), BACKLOG);
new_flags = fcntl(tcp_listen_socket, F_GETFL) | O_NONBLOCK;
fcntl(tcp_listen_socket, F_SETFL, new_flags);
doServer(local_listen_socket, tcp_listen_socket);
if (TEMP_FAILURE_RETRY(close(local_listen_socket)) < 0)
ERR("close");
if (unlink(argv[1]) < 0)
ERR("unlink");
if (TEMP_FAILURE_RETRY(close(tcp_listen_socket)) < 0)
ERR("close");
fprintf(stderr, "Server has terminated.\n");
return EXIT_SUCCESS;
}
l8-1_client_local.c:
#include "l8_common.h"
int make_socket(char *name, struct sockaddr_un *addr)
{
int socketfd;
if ((socketfd = socket(PF_UNIX, SOCK_STREAM, 0)) < 0)
ERR("socket");
memset(addr, 0, sizeof(struct sockaddr_un));
addr->sun_family = AF_UNIX;
strncpy(addr->sun_path, name, sizeof(addr->sun_path) - 1);
return socketfd;
}
int connect_socket(char *name)
{
struct sockaddr_un addr;
int socketfd;
socketfd = make_socket(name, &addr);
if (connect(socketfd, (struct sockaddr *)&addr, SUN_LEN(&addr)) < 0)
{
ERR("connect");
}
return socketfd;
}
void usage(char *name) { fprintf(stderr, "USAGE: %s socket operand1 operand2 operation \n", name); }
void prepare_request(char **argv, int32_t data[5])
{
data[0] = htonl(atoi(argv[2]));
data[1] = htonl(atoi(argv[3]));
data[2] = htonl(0);
data[3] = htonl((int32_t)(argv[4][0]));
data[4] = htonl(1);
}
void print_answer(int32_t data[5])
{
if (ntohl(data[4]))
printf("%d %c %d = %d\n", ntohl(data[0]), (char)ntohl(data[3]), ntohl(data[1]), ntohl(data[2]));
else
printf("Operation impossible\n");
}
int main(int argc, char **argv)
{
int fd;
int32_t data[5];
if (argc != 5)
{
usage(argv[0]);
return EXIT_FAILURE;
}
fd = connect_socket(argv[1]);
prepare_request(argv, data);
if (bulk_write(fd, (char *)data, sizeof(int32_t[5])) < 0)
ERR("write:");
if (bulk_read(fd, (char *)data, sizeof(int32_t[5])) < (int)sizeof(int32_t[5]))
ERR("read:");
print_answer(data);
if (TEMP_FAILURE_RETRY(close(fd)) < 0)
ERR("close");
return EXIT_SUCCESS;
}
l8-1_client_tcp.c:
#include "l8_common.h"
void prepare_request(char **argv, int32_t data[5])
{
data[0] = htonl(atoi(argv[3]));
data[1] = htonl(atoi(argv[4]));
data[2] = htonl(0);
data[3] = htonl((int32_t)(argv[5][0]));
data[4] = htonl(1);
}
void print_answer(int32_t data[5])
{
if (ntohl(data[4]))
printf("%d %c %d = %d\n", ntohl(data[0]), (char)ntohl(data[3]), ntohl(data[1]), ntohl(data[2]));
else
printf("Operation impossible\n");
}
void usage(char *name) { fprintf(stderr, "USAGE: %s domain port operand1 operand2 operation \n", name); }
int main(int argc, char **argv)
{
int fd;
int32_t data[5];
if (argc != 6)
{
usage(argv[0]);
return EXIT_FAILURE;
}
fd = connect_tcp_socket(argv[1], argv[2]);
prepare_request(argv, data);
if (bulk_write(fd, (char *)data, sizeof(int32_t[5])) < 0)
ERR("write:");
if (bulk_read(fd, (char *)data, sizeof(int32_t[5])) < (int)sizeof(int32_t[5]))
ERR("read:");
print_answer(data);
if (TEMP_FAILURE_RETRY(close(fd)) < 0)
ERR("close");
return EXIT_SUCCESS;
}
To run the code:
$ ./l8-1_server a 2000&
$ ./l8-1_client_local a 2 1 +
$ ./l8-1_client_local a 2 1 '*'
$ ./l8-1_client_local a 2 0 /
$ ./l8-1_client_tcp localhost 2000 234 17 /
$ killall -s SIGINT l8-1_server
In this solution (and also in the next example) all sources use the common library to avoid implementing functions like bulk_read many times.
You may be curious why the constant BACKLOG is set for 3, why not 5,7 or 9? In practice any small number would fit here, this value is merely a hint for the operating system. This program will not deal with a large amount of network traffic and it handles connections very promptly so the queue of waiting connections will never be long. When you expect a heavier traffic in your program please test larger values and find the smallest one that suits your needs. Unfortunately this value will be different on different operating systems.
In this code the macro SUN_LEN is used. Why not use sizeof instead? Both approaches work correctly. You should know that sizeof returns a larger size than the macro because the two count differently: sizeof covers the whole address structure, including the unused tail of the sun_path array, while the macro counts only the bytes in front of sun_path plus the length of the name actually stored there. The implementation expects the smaller of the two values, but as the address is just a zero-delimited string, the larger size has no effect on the address itself. What should you choose? In this tutorial sizes are calculated with the macro SUN_LEN. This way we pass only the bytes that are really used, at the cost of a few more CPU cycles to compute the size. If you decide that CPU time matters more, you can use sizeof; it will not be considered an error.
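The difference between the two sizes can be checked with a short sketch. The helper local_address_lengths is a hypothetical name, and the fallback definition of SUN_LEN is only an assumption for platforms whose headers do not provide the macro.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Fallback for headers that do not define SUN_LEN (assumption, not tutorial code). */
#ifndef SUN_LEN
#define SUN_LEN(p) (offsetof(struct sockaddr_un, sun_path) + strlen((p)->sun_path))
#endif

/* Fill addr with the given name and report both candidate address lengths. */
void local_address_lengths(const char *name, struct sockaddr_un *addr, size_t *short_len, size_t *full_len)
{
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    strncpy(addr->sun_path, name, sizeof(addr->sun_path) - 1);
    *short_len = SUN_LEN(addr);  /* bytes before sun_path + length of the name */
    *full_len = sizeof(*addr);   /* whole structure, including the unused tail */
}
```

Either value can be passed as the third argument of bind or connect; printing both for a short name such as a shows how much of the structure stays unused.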
The POSIX standard states that if connecting over the network gets interrupted, it must continue asynchronously to the main code. It is quite logical if you remember that connecting engages two processes, usually on different computers. If a signal handling function interrupts connect, you can not restart it as usual (using TEMP_FAILURE_RETRY); that would result in an EALREADY error. In our client code we do not handle signals, but how can it be done? You need to check if connect was interrupted (errno==EINTR) and then use select, poll, or epoll* to wait until the socket is ready for writing.
As we always have to implement asynchronous waiting for connect, it may be reasonable not to waste time on the connection and to plan some code to run in the meantime. We can set the socket to nonblocking mode and have the connecting done in the background even if connect is not interrupted.
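The two paragraphs above can be sketched as follows. This is a minimal sketch, not code from the tutorial sources: the names finish_connect and connect_restartable are hypothetical, and poll is used for the waiting step (select or epoll would do as well). Once the socket becomes writable, the outcome of the connection is read with getsockopt and SO_ERROR, as man 2 connect describes for EINPROGRESS.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>

/* Wait until a connect() still in progress on fd finishes.
 * Returns 0 on success, -1 with errno set on failure. */
int finish_connect(int fd)
{
    struct pollfd pfd = {.fd = fd, .events = POLLOUT};
    int err = 0;
    socklen_t len = sizeof(err);
    int ret;
    /* poll() itself may be interrupted, so it is restarted by hand. */
    do
        ret = poll(&pfd, 1, -1);
    while (ret < 0 && errno == EINTR);
    if (ret < 0)
        return -1;
    /* Writable does not mean connected: the real result is in SO_ERROR. */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    if (err != 0)
    {
        errno = err;
        return -1;
    }
    return 0;
}

/* connect() that survives signals and also works on nonblocking sockets. */
int connect_restartable(int fd, const struct sockaddr *addr, socklen_t addrlen)
{
    if (connect(fd, addr, addrlen) == 0)
        return 0;
    /* EINTR: interrupted by a signal, EINPROGRESS: nonblocking socket. */
    if (errno != EINTR && errno != EINPROGRESS)
        return -1;
    return finish_connect(fd);
}
```

In connect_tcp_socket the call to connect could be replaced with connect_restartable; a program that wants to do other work in the meantime would call finish_connect later instead of immediately.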
This program implements a trivial network data exchange schema (so called protocol). Client connects and sends the request, server responds and both parties disconnect. Much more complex protocols are possible in your programs of course.
All data exchanged up to given point of connection time creates what we refer to as a context of the transmission. Depending on the connection protocol context can be more or less complex. In this program the only context that exists is the question sent by the client. The response is based on the context (the question).
You may ask why macros ntohl and htonl are in use as the connection is local and byte order should not be a concern? First of all, the same code will be reused in the second stage for TCP network connection that demands the conversion. The second reason is that local sockets are not limited to work only on local file systems. In future, it may be possible to create such a socket on a network file system and connect two different machines/architectures. Then the byte order would be an issue that this code can handle.
In this code the function bulk_read is used, so it is important to know how it behaves on a nonblocking descriptor. If data is not available it returns immediately with the EAGAIN error. Is this the case in this code? Is the descriptor in nonblocking mode? The new descriptor returned by the accept function does not have to inherit the flags! On Linux the flags are indeed not inherited, so as long as Linux is used the nonblocking mode of the listening descriptor does not interfere with bulk_read. In fact it would not cause problems on platforms that do inherit the nonblocking flag either, because we call bulk_read after we learn that the data is already available.
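The behaviour described above can be observed with two small helpers. Both names, set_nonblocking and is_nonblocking, are hypothetical; the tutorial server sets the flag inline with fcntl and does not check the result.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Switch fd to nonblocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Returns 1 when fd is in nonblocking mode, 0 when it is not, -1 on error. */
int is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;
    return (flags & O_NONBLOCK) ? 1 : 0;
}
```

Calling is_nonblocking on the descriptor returned by accept shows whether a given platform inherits the flag from the listening socket. A read on a nonblocking descriptor with no data fails with EAGAIN (or EWOULDBLOCK) instead of waiting.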
You are obliged to use getaddrinfo function, the older gethostbyname function is described as obsolete in the man page and can not be used in your solutions.
How can you check on the socket file after you started the server?
Answer: $ ls -l a
What is the role of the epoll_pwait call in this program?
Answer: This is the point where the program waits for input on the descriptors and at the same time waits for SIGINT delivery.
Can we replace epoll_pwait with epoll_wait?
Answer: Yes, but it is not worth the effort, as proper SIGINT handling would then require much more code.
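The pattern used in doServer can be isolated in a short sketch: the signal stays blocked in normal code and is let through only for the duration of the wait. The names prepare_stop_signal, wait_events and stop_requested are hypothetical, and the timeout parameter is an addition made here for illustration (the server waits with -1).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/epoll.h>

volatile sig_atomic_t stop_requested = 0;

static void on_stop(int sig)
{
    (void)sig;
    stop_requested = 1;
}

/* Install the handler and block signo in normal code.
 * wait_mask receives the previous mask, in which signo is still deliverable. */
int prepare_stop_signal(int signo, sigset_t *wait_mask)
{
    struct sigaction act;
    sigset_t block;
    memset(&act, 0, sizeof(act));
    act.sa_handler = on_stop;
    if (sigaction(signo, &act, NULL) < 0)
        return -1;
    sigemptyset(&block);
    sigaddset(&block, signo);
    return sigprocmask(SIG_BLOCK, &block, wait_mask);
}

/* The signal can be delivered only inside this call; a signal that arrived
 * earlier is still pending and interrupts the wait with EINTR. */
int wait_events(int epfd, struct epoll_event *events, int max_events, int timeout_ms, const sigset_t *wait_mask)
{
    return epoll_pwait(epfd, events, max_events, timeout_ms, wait_mask);
}
```

With plain epoll_wait the signal would have to stay unblocked, and a signal arriving between the check of the flag and the start of the wait would go unnoticed until the next event.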
Why is the network listening socket in nonblocking mode?
Answer: With this descriptor in blocking mode the program could block on the accept call. Consider a client that tries to connect and then unexpectedly vanishes from the network. epoll_pwait may report the socket as ready for a connection, but by the time accept is called the client has disappeared and there is no one to connect. The program would simply block on accept until another client arrives. Nonblocking mode prevents this situation.
Why does the program use int32_t (stdint.h) instead of plain int?
Answer: Because the size of int varies between architectures.
Why is SIGPIPE ignored in the server?
Answer: It is generally easier to handle the broken pipe condition by checking for the EPIPE error than by handling the SIGPIPE signal when the reaction to the disconnection is not critical - the server only closes the current connection and continues to serve other clients.
Why are bulk_read and bulk_write used in the program? SIGINT terminates the program, so interruption of read and write should not be a problem.
Answer: For the same reason TEMP_FAILURE_RETRY is so common in the code - portability of this code to your solutions. With bulk_read/bulk_write the code is interruption proof.
What is the purpose of the unlink in the server code?
Answer: It removes the local socket the same way it removes a file or a fifo - it is a clean-up function.
What is the purpose of the socket option SO_REUSEADDR?
Answer: This option allows you to quickly bind to the same port on the server if the program was running a moment ago. It is essential for testing, when you encounter a bug, quickly correct it and then want to run the program again. Without this option the system will block binding to the port for a few minutes.
Does the above socket option mean that packets from the previous connection can still be received in a new session?
Answer: No, the TCP protocol is immune to this kind of disturbance. If it was UDP then the answer would be yes.
Why is the address INADDR_ANY used for the server, and what is the value of this constant?
Answer: This is a special address with the value 0.0.0.0. It means any address. It is used as the local address of the server in place of the real IP of the server (a server can have more than one IP). It does not mean the program will be catching all the Internet connections! If a connection is directed (routed) to our server then we do not care what IP address the client used; if the ports match, the connection is handled without IP matching. It is a very popular and convenient solution.
TCP client code and local client code are very similar. As an exercise, integrate those two types of client into one program with a parameter switch (-p local|tcp).
Task 2 - UDP #
Goal:
Write a client and a server program that communicate over a UDP socket. The client's task is to send a file, divided into datagrams of proper size, to the server. The server prints out the received data without information about the source. Each packet sent to the server must be confirmed with a return message. If the confirmation is missing (wait 0.5 s), the packet is sent again. If 5 tries in a row fail, the client program exits with an error. Both data packets and confirmations can be lost, and the program must resolve this issue. The server can not print the same part of the file more than once. All metadata (everything apart from the file content) sent over the UDP socket must be converted to the int32_t type. You can assume that the maximum allowed datagram size (all data and metadata) is 576B. The server can handle 5 concurrent transmissions at a time. If a sixth client tries to send data it should be ignored. The server program takes the port as its sole parameter; the client takes the address and port of the server as well as the file name as its parameters.
What you need to know:
man 7 udp
man 3p sendto
man 3p recvfrom
man 3p recv
man 3p send
Solution l8-2_server.c:
#include "l8_common.h"
#define BACKLOG 3
#define MAXBUF 576
#define MAXADDR 5
struct connections
{
int free;
int32_t chunkNo;
struct sockaddr_in addr;
};
int make_socket(int domain, int type)
{
int sock;
sock = socket(domain, type, 0);
if (sock < 0)
ERR("socket");
return sock;
}
int bind_inet_socket(uint16_t port, int type)
{
struct sockaddr_in addr;
int socketfd, t = 1;
socketfd = make_socket(PF_INET, type);
memset(&addr, 0, sizeof(struct sockaddr_in));
addr.sin_family = AF_INET;
addr.sin_port = htons(port);
addr.sin_addr.s_addr = htonl(INADDR_ANY);
if (setsockopt(socketfd, SOL_SOCKET, SO_REUSEADDR, &t, sizeof(t)))
ERR("setsockopt");
if (bind(socketfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
ERR("bind");
if (SOCK_STREAM == type)
if (listen(socketfd, BACKLOG) < 0)
ERR("listen");
return socketfd;
}
int findIndex(struct sockaddr_in addr, struct connections con[MAXADDR])
{
int i, empty = -1, pos = -1;
for (i = 0; i < MAXADDR; i++)
{
if (con[i].free)
empty = i;
else if (0 == memcmp(&addr, &(con[i].addr), sizeof(struct sockaddr_in)))
{
pos = i;
break;
}
}
if (-1 == pos && empty != -1)
{
con[empty].free = 0;
con[empty].chunkNo = 0;
con[empty].addr = addr;
pos = empty;
}
return pos;
}
void doServer(int fd)
{
struct sockaddr_in addr;
struct connections con[MAXADDR];
char buf[MAXBUF + 1] = {0}; /* extra zero byte: a full datagram is still a terminated string for printf */
socklen_t size = sizeof(addr);
int i;
int32_t chunkNo, last;
for (i = 0; i < MAXADDR; i++)
con[i].free = 1;
for (;;)
{
if (TEMP_FAILURE_RETRY(recvfrom(fd, buf, MAXBUF, 0, (struct sockaddr *)&addr, &size)) < 0)
ERR("read:");
if ((i = findIndex(addr, con)) >= 0)
{
chunkNo = ntohl(*((int32_t *)buf));
last = ntohl(*(((int32_t *)buf) + 1));
if (chunkNo > con[i].chunkNo + 1)
continue;
else if (chunkNo == con[i].chunkNo + 1)
{
if (last)
{
printf("Last Part %d\n%s\n", chunkNo, buf + 2 * sizeof(int32_t));
con[i].free = 1;
}
else
printf("Part %d\n%s\n", chunkNo, buf + 2 * sizeof(int32_t));
con[i].chunkNo++;
}
if (TEMP_FAILURE_RETRY(sendto(fd, buf, MAXBUF, 0, (struct sockaddr *)&addr, size)) < 0)
{
if (EPIPE == errno)
con[i].free = 1;
else
ERR("send:");
}
}
}
}
void usage(char *name) { fprintf(stderr, "USAGE: %s port\n", name); }
int main(int argc, char **argv)
{
int fd;
if (argc != 2)
{
usage(argv[0]);
return EXIT_FAILURE;
}
if (sethandler(SIG_IGN, SIGPIPE))
ERR("Setting SIGPIPE:");
fd = bind_inet_socket(atoi(argv[1]), SOCK_DGRAM);
doServer(fd);
if (TEMP_FAILURE_RETRY(close(fd)) < 0)
ERR("close");
fprintf(stderr, "Server has terminated.\n");
return EXIT_SUCCESS;
}
Solution l8-2_client.c:
#include "l8_common.h"
#define MAXBUF 576
volatile sig_atomic_t last_signal = 0;
void sigalrm_handler(int sig) { last_signal = sig; }
int make_socket(void)
{
int sock;
sock = socket(PF_INET, SOCK_DGRAM, 0);
if (sock < 0)
ERR("socket");
return sock;
}
void usage(char *name) { fprintf(stderr, "USAGE: %s domain port file \n", name); }
void sendAndConfirm(int fd, struct sockaddr_in addr, char *buf1, char *buf2, ssize_t size)
{
struct itimerval ts;
if (TEMP_FAILURE_RETRY(sendto(fd, buf1, size, 0, (struct sockaddr *)&addr, sizeof(addr))) < 0)
ERR("sendto:");
memset(&ts, 0, sizeof(struct itimerval));
ts.it_value.tv_usec = 500000;
setitimer(ITIMER_REAL, &ts, NULL);
last_signal = 0;
while (recv(fd, buf2, size, 0) < 0)
{
if (EINTR != errno)
ERR("recv:");
if (SIGALRM == last_signal)
break;
}
}
void doClient(int fd, struct sockaddr_in addr, int file)
{
char buf[MAXBUF];
char buf2[MAXBUF];
int offset = 2 * sizeof(int32_t);
int32_t chunkNo = 0;
int32_t last = 0;
ssize_t size;
int counter;
do
{
if ((size = bulk_read(file, buf + offset, MAXBUF - offset)) < 0)
ERR("read from file:");
*((int32_t *)buf) = htonl(++chunkNo);
if (size < MAXBUF - offset)
{
last = 1;
memset(buf + offset + size, 0, MAXBUF - offset - size);
}
*(((int32_t *)buf) + 1) = htonl(last);
memset(buf2, 0, MAXBUF);
counter = 0;
do
{
counter++;
sendAndConfirm(fd, addr, buf, buf2, MAXBUF);
} while (*((int32_t *)buf2) != (int32_t)htonl(chunkNo) && counter < 5);
if (*((int32_t *)buf2) != (int32_t)htonl(chunkNo))
break; /* 5 tries in a row failed */
} while (size == MAXBUF - offset);
}
int main(int argc, char **argv)
{
int fd, file;
struct sockaddr_in addr;
if (argc != 4)
{
usage(argv[0]);
return EXIT_FAILURE;
}
if (sethandler(SIG_IGN, SIGPIPE))
ERR("Setting SIGPIPE:");
if (sethandler(sigalrm_handler, SIGALRM))
ERR("Setting SIGALRM:");
if ((file = TEMP_FAILURE_RETRY(open(argv[3], O_RDONLY))) < 0)
ERR("open:");
fd = make_socket();
addr = make_address(argv[1], argv[2]);
doClient(fd, addr, file);
if (TEMP_FAILURE_RETRY(close(fd)) < 0)
ERR("close");
if (TEMP_FAILURE_RETRY(close(file)) < 0)
ERR("close");
return EXIT_SUCCESS;
}
There is no connection in UDP protocol, sockets send datagrams “ad hoc”. There is no listening socket. Losses, duplicates and reordering of datagrams are possible!
In this example you will find useful candidates for library functions, such as make_socket and bind_inet_socket. As their names conflict with the previously recommended functions, you have to rename them.
In this example the connection context is more demanding. What constitutes the context here?
Answer: The number of packets of the given file sent so far is the context here - in other words, struct connections.
What data is sent in a datagram? What is the purpose of the metadata?
Answer: The datagram consists of (1) a 32 bit file part number, (2) a 32 bit flag telling whether this is the last part of the file, (3) the file part. The metadata helps to maintain the context: keep track of the file being sent (1) and end the transmission (2).
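The layout described in this answer can be written down as a pair of helpers. The names build_datagram and parse_header are hypothetical; unlike the tutorial code, which casts the buffer to int32_t pointers, this sketch copies the header with memcpy so that it does not depend on the alignment of the buffer.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

#define MAXBUF 576
#define HEADER_SIZE (2 * sizeof(int32_t))

/* Fill out (MAXBUF bytes) with: part number, last-part flag, file part.
 * Unused bytes are zeroed. Returns 0, or -1 if the payload does not fit. */
int build_datagram(char out[MAXBUF], int32_t chunk_no, int32_t last, const char *payload, size_t payload_len)
{
    uint32_t header[2] = {htonl((uint32_t)chunk_no), htonl((uint32_t)last)};
    if (payload_len > MAXBUF - HEADER_SIZE)
        return -1;
    memset(out, 0, MAXBUF);
    memcpy(out, header, HEADER_SIZE);
    memcpy(out + HEADER_SIZE, payload, payload_len);
    return 0;
}

/* Read the two metadata fields back in host byte order. */
void parse_header(const char in[MAXBUF], int32_t *chunk_no, int32_t *last)
{
    uint32_t header[2];
    memcpy(header, in, HEADER_SIZE);
    *chunk_no = (int32_t)ntohl(header[0]);
    *last = (int32_t)ntohl(header[1]);
}
```

The file part itself starts at offset HEADER_SIZE and is copied unchanged, since single bytes are not affected by byte order.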
Why and on what descriptors are bulk_read and bulk_write used? Should we extend this use to all the descriptors in the program?
Answer: These functions restart read and write after an interruption of IO. Notice that they handle both types of interruption: before the IO starts (EINTR) and in the middle of the IO. They are only used on files, as datagrams are sent atomically (the transfer can not be interrupted). In this program signals are handled in parts of the code that do not operate on files; still, bulk_ functions are used for the sake of code portability.
Do we expect a broken connection in this program? Should we add checks in the code?
Answer: We do not have a connection to break in UDP.
How does findIndex work in the server code? How are addresses compared? What byte order are they in? What will the function do if the address is new?
Answer: Addresses are compared in binary format without conversion to host byte order. We do not need to convert the byte order as we only compare the addresses, we do not display them. If a new address is passed to this function, it starts a new record for this address in the array (provided there is space for it).
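The server compares whole structures with memcmp. A possible alternative, shown here only as a sketch with the hypothetical name same_peer, compares the meaningful members one by one, still without any byte order conversion. This variant does not depend on the content of the padding bytes (sin_zero) of struct sockaddr_in.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Returns 1 when both addresses name the same IPv4 endpoint, 0 otherwise.
 * Port and address stay in network byte order: equality does not depend on it. */
int same_peer(const struct sockaddr_in *a, const struct sockaddr_in *b)
{
    return a->sin_family == b->sin_family && a->sin_port == b->sin_port &&
           a->sin_addr.s_addr == b->sin_addr.s_addr;
}
```

In findIndex the memcmp call could be replaced with same_peer(&addr, &con[i].addr).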
How does this program deal with duplicated datagrams?
Answer: It keeps an array of active connections (struct connections) with the number of the last transferred part. Duplicated parts are not processed.
How does this program deal with reordered datagrams, i.e. when the program receives a part that is farther in the file than the next expected one?
Answer: This can not happen in this program; the client will not send such a part without having all the previous parts confirmed by the server.
How does this program handle lost datagrams?
Answer: Client side re-transmission.
What will happen if the packet with the confirmation from the server to the client gets lost?
Answer: The client will assume that the last part sent did not get to the server and will send it again. The server will get a duplicate of a part that is already processed; it will not process it again but it will send another confirmation.
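The decision the server makes for every incoming part can be summarized as a small pure function. This is a sketch that mirrors the conditions in doServer; the enum and the name classify_chunk are hypothetical.

```c
#include <stdint.h>

enum chunk_action
{
    CHUNK_IGNORE,       /* part from the future: no print, no confirmation */
    CHUNK_ACK_ONLY,     /* duplicate: confirm again, do not print */
    CHUNK_PRINT_AND_ACK /* next expected part: print and confirm */
};

/* last_confirmed is the number of the last part already processed
 * for this client (0 when nothing has been received yet). */
enum chunk_action classify_chunk(int32_t last_confirmed, int32_t chunk_no)
{
    if (chunk_no > last_confirmed + 1)
        return CHUNK_IGNORE;
    if (chunk_no == last_confirmed + 1)
        return CHUNK_PRINT_AND_ACK;
    return CHUNK_ACK_ONLY;
}
```

For example, after part 4 has been processed, a repeated part 4 yields CHUNK_ACK_ONLY, which is exactly the case of a lost confirmation.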
What is in the confirmation?
Answer: The server sends back exactly the same data as it received.
How is the timeout on the server response implemented?
Answer: In the function sendAndConfirm a 0.5 s alarm is set (with setitimer), then the program waits for the confirmation in a regular, not restarted recv call. If a signal is received, recv gets interrupted and the code checks if the timeout was triggered.
Why does the program convert the byte order of only the part number and the last part marker, and not the rest?
Answer: Only those two are sent in binary integer format; the rest is a file part sent as a text string, which does not require the conversion.
Analyze how the limit of 5 connections works. Pay attention to how the “free” member of the connections structure works and how it is affected by the last part marker in the datagram.
Sample task #
Complete the sample exercises. You will have more time and starter code during the lab session, but completing the tasks below on your own means you are well prepared.
- Exercise 1 ~60 minutes
- Exercise 2 ~120 minutes
- Exercise 3 ~120 minutes
- Exercise 4 ~120 minutes