
0015001 /*       this file contains the interface of the network software with the file
0015002  *       system.
The "sr" in "sr.c" stands for "Send/Receive". As the comment above indicates, this file is the network service's interface with the file system.

All of the code in this file revolves around the array sr_fd_table[], which is described below.


sr_fd / sr_fd_table[] / sr_add_minor()


One of the most important data arrays in the network service is sr_fd_table[], an array of 64 struct sr_fd's. Each sr_fd element in sr_fd_table[] corresponds to either a device or an opened file descriptor to a device (i.e., a "channel"):

typedef struct sr_fd

{
int srf_flags;
int srf_fd;
int srf_port;
sr_open_t srf_open;
sr_close_t srf_close;
sr_write_t srf_write;
sr_read_t srf_read;
sr_ioctl_t srf_ioctl;
sr_cancel_t srf_cancel;
mq_t *srf_ioctl_q, *srf_ioctl_q_tail;
mq_t *srf_read_q, *srf_read_q_tail;
mq_t *srf_write_q, *srf_write_q_tail;
} sr_fd_t;
For each device (e.g., /dev/udp0), an element in sr_fd_table[] is configured by sr_add_minor(). For example, for the following inet.conf file:

eth0 DP8390 0 { default; };
psip1;

an element (i.e., a struct sr_fd) is configured for each of the following devices:

/dev/eth0 sr_fd_table[1]
/dev/ip0 sr_fd_table[2]
/dev/tcp0 sr_fd_table[3]
/dev/udp0 sr_fd_table[4]

/dev/psip1 sr_fd_table[17]
/dev/ip1 sr_fd_table[18]
/dev/tcp1 sr_fd_table[19]
/dev/udp1 sr_fd_table[20]




sr_add_minor() is called in the initialization routines for the various protocols: mnx_eth.c (osdep_eth_init()), psip.c (psip_enable()), ip.c (ip_init()), tcp.c (tcp_init()), and udp.c (udp_init()).



When a device file (e.g., /dev/udp0) is opened by a process, the element that corresponds to the device is copied to an element that is currently unoccupied (see sr_open()). In this way, a "channel" is opened. Using this technique, a channel can be opened, closed, and manipulated without affecting the elements of the descriptors initially set by sr_add_minor().
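
The copy-on-open idea described above can be sketched as follows. This is only an illustration under stated assumptions: chan_t is a reduced stand-in for struct sr_fd, add_minor_sketch() and open_channel_sketch() are hypothetical names, and the choice of the first free element is an assumption rather than a statement about how sr_open() picks a slot.

```c
#include <assert.h>

#define FD_NR      64
#define SFF_FREE   0x00
#define SFF_MINOR  0x01
#define SFF_INUSE  0x02

/* Reduced stand-in for struct sr_fd: only the fields this sketch needs. */
typedef struct chan {
    int flags;
    int fd;
    int port;
} chan_t;

static chan_t table[FD_NR];     /* zero-initialized, so every slot is SFF_FREE */

/* Configure the template element for a device (hypothetical helper). */
static void add_minor_sketch(int minor, int port)
{
    assert(minor >= 0 && minor < FD_NR);
    table[minor].flags = SFF_INUSE | SFF_MINOR;
    table[minor].fd = minor;
    table[minor].port = port;
}

/* Open a channel by copying the device's template element into a free
 * element. Returns the index of the new channel, or -1 on failure. */
static int open_channel_sketch(int minor)
{
    int i;

    if (minor < 0 || minor >= FD_NR || !(table[minor].flags & SFF_MINOR))
        return -1;                      /* no such device configured */
    for (i = 0; i < FD_NR; i++) {
        if (table[i].flags == SFF_FREE) {
            table[i] = table[minor];    /* copy the template */
            table[i].flags = SFF_INUSE; /* a channel, not a device entry */
            return i;
        }
    }
    return -1;                          /* table is full */
}
```

Because the channel is a copy, closing it only has to reset that one element to SFF_FREE; the template element for the device is left as it was configured.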


int srf_flags:

srf_flags is a combination of the following:

#define SFF_FREE 0x00
#define SFF_MINOR 0x01
#define SFF_INUSE 0x02
#define SFF_BUSY 0x3C
#define SFF_IOCTL_IP 0x04
#define SFF_READ_IP 0x08
#define SFF_WRITE_IP 0x10
#define SFF_PENDING_REQ 0x30
#define SFF_SUSPENDED 0x1C0
#define SFF_IOCTL_SUSP 0x40
#define SFF_READ_SUSP 0x80
#define SFF_WRITE_SUSP 0x100

srf_flags is initialized to SFF_FREE for each element in sr_fd_table[]. If the channel corresponds to a device file, srf_flags is set to SFF_INUSE | SFF_MINOR. If the channel does not correspond to a device file, srf_flags is set simply to SFF_INUSE.

When a request comes in for a read, write, or ioctl operation and the network service is not already processing another request for the same operation, srf_flags is set to SFF_READ_IP, SFF_WRITE_IP, or SFF_IOCTL_IP. However, if an operation is attempted but the underlying protocol is still processing a previous request of the same nature (e.g., udp_write()), the appropriate flag (SFF_IOCTL_SUSP, SFF_READ_SUSP, or SFF_WRITE_SUSP) in srf_flags is set.
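
The flag handling in the paragraph above can be sketched as below. The flag values are the ones listed for srf_flags; the enum sketch_op and the function note_request() are hypothetical names, and the logic mirrors the prose description rather than the actual code in sr_rwio().

```c
#include <assert.h>

#define SFF_INUSE       0x02
#define SFF_IOCTL_IP    0x04
#define SFF_READ_IP     0x08
#define SFF_WRITE_IP    0x10
#define SFF_SUSPENDED   0x1C0
#define SFF_IOCTL_SUSP  0x40
#define SFF_READ_SUSP   0x80
#define SFF_WRITE_SUSP  0x100

/* Hypothetical operation index used only by this sketch. */
enum sketch_op { OP_IOCTL, OP_READ, OP_WRITE };

static const int ip_flag[]   = { SFF_IOCTL_IP,   SFF_READ_IP,   SFF_WRITE_IP };
static const int susp_flag[] = { SFF_IOCTL_SUSP, SFF_READ_SUSP, SFF_WRITE_SUSP };

/* Record a new request in *flags: mark the operation "in progress" if no
 * request of that kind is being processed, otherwise mark it "suspended". */
static void note_request(int *flags, enum sketch_op op)
{
    assert(flags != 0);
    if (*flags & ip_flag[op])
        *flags |= susp_flag[op];
    else
        *flags |= ip_flag[op];
}
```

Note that SFF_SUSPENDED (0x1C0) is exactly the union of the three _SUSP bits, so a single mask test tells whether any operation on the channel is suspended.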


int srf_fd, srf_port:

srf_fd and srf_port are both set by sr_add_minor(). For the channels in sr_fd_table[] that correspond to the device files (e.g., /dev/udp0), srf_fd is set to the minor device number of the device. For example, if /dev/udp0 is added to sr_fd_table[] and the interface number of the device file is 0 (see comments for ip_conf[]), then the minor device number is:

if2minor(ifno, dev) = ((0)*16 + UDP_DEV) = 0 + 4 = 4
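
The same arithmetic can be checked with a small sketch. The macro body follows the formula above; UDP_DEV = 4 comes from the text, while the other offset names (ETH_DEV, PSIP_DEV, IP_DEV, TCP_DEV) and the helper minor_for() are assumptions whose values are inferred from the table of minor numbers earlier in this section.

```c
#include <assert.h>

#define IP_PORT_MAX 4   /* maximum number of network devices on the 386 */

/* Device offsets within one interface's block of 16 minor numbers.
 * Only UDP_DEV is named in the text; the other names are assumed. */
#define ETH_DEV   1
#define PSIP_DEV  1
#define IP_DEV    2
#define TCP_DEV   3
#define UDP_DEV   4

#define if2minor(ifno, dev) ((ifno) * 16 + (dev))

/* Range-checked wrapper (hypothetical helper, not part of the inet source). */
static int minor_for(int ifno, int dev)
{
    assert(ifno >= 0 && ifno < IP_PORT_MAX);
    assert(dev >= 1 && dev < 16);
    return if2minor(ifno, dev);
}
```

With these values, minor_for(0, UDP_DEV) is 4 (/dev/udp0) and minor_for(1, UDP_DEV) is 20 (/dev/udp1), matching the table.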

For the channels in sr_fd_table[] that do not correspond to a device file, srf_fd is the file descriptor for the appropriate protocol. For example, if the file system requests that a udp channel be opened, srf_open is dereferenced and udp_open() is called. udp_open() opens a udp file descriptor and returns the index of the corresponding element in udp_fd_table[]. srf_fd is set to the index of this element.

Later, when the file system requests a read or a write on the open channel, srf_fd is passed into the protocol-specific read or write function (e.g., udp_read()), allowing the protocol-specific function to locate the appropriate file descriptor (e.g., udp file descriptor).

srf_port is more straightforward: it is the index in the protocol's port table. For example, if a system has two udp device files (/dev/udp0 and /dev/udp1), udp_port_table[] will have two entries, 0 and 1. Therefore, srf_port for the entry in sr_fd_table[] that corresponds to /dev/udp0 will be 0 and srf_port for the entry that corresponds to /dev/udp1 will be 1.


sr_open_t srf_open:
sr_close_t srf_close:
sr_write_t srf_write:
sr_read_t srf_read:
sr_ioctl_t srf_ioctl:
sr_cancel_t srf_cancel:


The fields above are all pointers to protocol-specific functions and are all set by sr_add_minor(). For example, when sr_add_minor() is called by udp_init(), srf_open, srf_close, srf_write, srf_read, srf_ioctl, and srf_cancel are set to point to the functions udp_open(), udp_close(), udp_write(), udp_read(), udp_ioctl(), and udp_cancel(). Later, when the file system makes a request to the network service, these functions will be called. For example, if the file system requests that data be written to a channel, srf_write is dereferenced and, if the channel is a udp channel, udp_write() is called.
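
A minimal sketch of this kind of dispatch through a function pointer is shown below. The signature of write_fn is simplified and hypothetical (the real sr_write_t takes different arguments), and demo_udp_write() is a stand-in for udp_write() that only records which descriptor it was called with.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical signature; not the real sr_write_t. */
typedef int (*write_fn)(int fd, size_t count);

/* Reduced stand-in for struct sr_fd. */
typedef struct chan {
    int fd;             /* protocol-level descriptor (like srf_fd) */
    write_fn write;     /* protocol-specific write (like srf_write) */
} chan_t;

static int last_fd = -1;    /* records the descriptor seen by the stand-in */

/* Stand-in for a protocol write routine such as udp_write(). */
static int demo_udp_write(int fd, size_t count)
{
    last_fd = fd;
    return (int)count;      /* pretend every byte was accepted */
}

/* Generic layer: knows nothing about the protocol, just dereferences
 * the pointer stored in the channel. */
static int chan_write(const chan_t *c, size_t count)
{
    assert(c != NULL && c->write != NULL);
    return c->write(c->fd, count);
}
```

The generic code never names a protocol; which routine runs is decided entirely by what was stored in the element when the device was registered.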

mq_t *srf_ioctl_q, *srf_ioctl_q_tail:
mq_t *srf_read_q, *srf_read_q_tail:
mq_t *srf_write_q, *srf_write_q_tail:


The fields above are the heads and tails of linked lists of ioctl, read, and write messages waiting to be processed. When a message requesting an ioctl, read, or write operation is received, it is placed at the end of the corresponding linked list (unless no earlier message of that type is still waiting to be processed).
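
Keeping both a head and a tail pointer makes appending a constant-time operation. The sketch below illustrates the idea with a reduced node type; node_t and enqueue() are hypothetical names, not taken from the inet source.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-in for mq_t: an id instead of a full message. */
typedef struct node {
    int id;
    struct node *next;
} node_t;

/* Append n to the list described by *head and *tail. */
static void enqueue(node_t **head, node_t **tail, node_t *n)
{
    assert(head != NULL && tail != NULL && n != NULL);
    n->next = NULL;
    if (*head == NULL)
        *head = n;              /* list was empty: n is also the head */
    else
        (*tail)->next = n;      /* link after the current last element */
    *tail = n;
}
```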


After the initialization of the network service, sr_rec() is called upon receipt of messages from the file system in the endless loop within main(). sr_rec() then calls a function to handle the specific request. For open requests, sr_rec() calls sr_open(); for read, write, and ioctl requests, sr_rec() calls sr_rwio(); for close requests, sr_rec() calls sr_close(); for cancel requests, sr_rec() calls sr_cancel().
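
The mapping from request type to handler described above can be summarized in a small sketch. The numeric values of the message types are placeholders, and struct request, handler_t, and handler_for() are hypothetical; the sketch only reports which handler would be chosen and does not reproduce sr_rec() itself.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder values: the real constants come from the Minix headers. */
enum { DEV_OPEN = 1, DEV_CLOSE, DEV_READ, DEV_WRITE, DEV_IOCTL3, DEV_CANCEL };

typedef enum { H_NONE, H_SR_OPEN, H_SR_CLOSE, H_SR_RWIO, H_SR_CANCEL } handler_t;

/* Reduced stand-in for a message: only the type matters here. */
struct request {
    int m_type;
};

/* Return which handler the dispatch step would select for this request. */
static handler_t handler_for(const struct request *req)
{
    assert(req != NULL);
    switch (req->m_type) {
    case DEV_OPEN:
        return H_SR_OPEN;
    case DEV_CLOSE:
        return H_SR_CLOSE;
    case DEV_READ:
    case DEV_WRITE:
    case DEV_IOCTL3:
        return H_SR_RWIO;       /* one routine serves all three */
    case DEV_CANCEL:
        return H_SR_CANCEL;
    default:
        return H_NONE;          /* unknown request type */
    }
}
```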


0015003  *
0015004  * Copyright 1995 Philip Homburg
0015005  *
messages from the FS

In order to communicate with one another, the services (the file system (FS), the memory manager (MM), and the network service), the kernel, and the user processes send messages back and forth. These messages take the following form:


typedef struct { 

int m_source; /* who sent the message */
int m_type; /* what kind of message is it */
union {
mess_1 m_m1;
mess_2 m_m2;
mess_3 m_m3;
mess_4 m_m4;
mess_5 m_m5;
mess_6 m_m6;
} m_u;
} message;
typedef struct {int m1i1, m1i2, m1i3; char *m1p1, *m1p2, *m1p3;} mess_1;
typedef struct {int m2i1, m2i2, m2i3; long m2l1, m2l2; char *m2p1;} mess_2;
typedef struct {int m3i1, m3i2; char *m3p1; char m3ca1[M3_STRING];} mess_3;
typedef struct {long m4l1, m4l2, m4l3, m4l4, m4l5;} mess_4;
typedef struct {char m5c1, m5c2; int m5i1, m5i2; long m5l1, m5l2, m5l3;}mess_5;
typedef struct {int m6i1, m6i2, m6i3; long m6l1; sighandler_t m6f1;} mess_6;

(In the names of the fields above, "i" stands for integer, "l" stands for long, and "p" stands for pointer.)

The message types (i.e., m_type) that the FS sends will be DEV_OPEN, DEV_CLOSE, DEV_READ, DEV_WRITE, DEV_IOCTL3, and DEV_CANCEL. For example, if a user process opens the /dev/udp file, the FS sends a message of type "DEV_OPEN" to the network service. The type of message that the network service always sends the FS is DEVICE_REPLY.

Recall that only a single field within a union is valid. When the file system (FS) receives a request (open, read, write, etc.) for one of the device files associated with the network service (/dev/udp, /dev/tcp, /dev/ip, etc.), the FS sends a message with an m_u of mess_2 to the network service. Repeated from above, the mess_2 typedef is declared as follows:

typedef struct {int m2i1, m2i2, m2i3; long m2l1, m2l2; char *m2p1;} mess_2;


m_source | m_type | DEVICE (m2_i1) | PROC_NR (m2_i2) | COUNT or REQUEST (m2_i3) | POSITION (m2_l1) | TTY_FLAGS (m2_l2) | ADDRESS (m2_p1)


Note: POSITION is not used here, and TTY_FLAGS is always false (zero). A TTY_FLAGS value of zero denotes blocking mode (in other words, the process is waiting for a response).

The meaning of each of these fields is as follows:

DEVICE is the minor number of the device.
PROC_NR is the process number of the client.
COUNT is the number of bytes to transfer.
REQUEST is specific to ioctl operations. An example of an ioctl operation is NWIOSUDPOPT (NetWork IO Set UDP OPTions).
POSITION is the position in the device file.
TTY_FLAGS is zero (blocking mode).
ADDRESS is a pointer to a buffer. This buffer will either be read from or written to by the network service.
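
To make the field layout concrete, the sketch below builds a read request using the mess_2 layout shown earlier. The accessor macros (m2_i1 and friends expanding to m_u.m_m2.*) are written here as an assumption about how the Minix headers define them, SKETCH_DEV_READ is a placeholder value, and make_read_request() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {int m2i1, m2i2, m2i3; long m2l1, m2l2; char *m2p1;} mess_2;

/* Reduced message: only the mess_2 member of the union is reproduced. */
typedef struct {
    int m_source;               /* who sent the message */
    int m_type;                 /* what kind of message is it */
    union {
        mess_2 m_m2;
    } m_u;
} message;

/* Assumed accessor macros, in the style of the Minix headers. */
#define m2_i1 m_u.m_m2.m2i1
#define m2_i2 m_u.m_m2.m2i2
#define m2_i3 m_u.m_m2.m2i3
#define m2_l1 m_u.m_m2.m2l1
#define m2_l2 m_u.m_m2.m2l2
#define m2_p1 m_u.m_m2.m2p1

#define DEVICE    m2_i1
#define PROC_NR   m2_i2
#define COUNT     m2_i3
#define POSITION  m2_l1
#define TTY_FLAGS m2_l2
#define ADDRESS   m2_p1

#define SKETCH_DEV_READ 3       /* placeholder, not the real constant */

/* Fill in a read request the way the field list above describes it. */
static message make_read_request(int minor, int proc, int count, char *buf)
{
    message m = {0};

    assert(count >= 0);
    m.m_type = SKETCH_DEV_READ;
    m.DEVICE = minor;           /* minor number of the device */
    m.PROC_NR = proc;           /* process number of the client */
    m.COUNT = count;            /* number of bytes to transfer */
    m.POSITION = 0;             /* not used by the network service */
    m.TTY_FLAGS = 0;            /* zero: blocking mode */
    m.ADDRESS = buf;            /* buffer in the client's address space */
    return m;
}
```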

It is interesting to compare the fields above with the parameters for dev_io (from the file system):

PUBLIC int dev_io(op, nonblock, dev, pos, bytes, proc, buff)
int op; /* DEV_READ, DEV_WRITE, DEV_IOCTL, etc. */
int nonblock; /* TRUE if nonblocking op */
dev_t dev; /* major-minor device number */
off_t pos; /* byte position */
int bytes; /* how many bytes to transfer */
int proc; /* in whose address space is buff? */
char *buff; /* virtual address of the buffer */


The format of reply messages is different:


m_source | DEVICE_REPLY | proc nr | status | fd | operation | unused | unused



The message queue is a linked list of mq_t structs:

typedef struct mq

{
message mq_mess;
struct mq *mq_next;
int mq_allocated;
} mq_t;
Immediately after initialization, the message queue is as follows: [figure: initial state of the message queue, not reproduced]



In preparation for receiving a message, mq_get() is called to take one of the mq_t structs off the free list of messages. This struct will hold the next message received and will be passed to the code that handles messages from the file system, the asynchronous alarm task, or the ethernet task.
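
A free list of this kind can be sketched as follows. The struct is reduced (an int payload instead of a full message), MQ_SIZE is an arbitrary size chosen for the example, and the functions carry a _sketch suffix to make clear they are illustrations rather than the real mq_init(), mq_get(), and mq_free().

```c
#include <assert.h>
#include <stddef.h>

#define MQ_SIZE 4               /* arbitrary size for this sketch */

typedef struct mq {
    int mq_payload;             /* stands in for "message mq_mess" */
    struct mq *mq_next;
    int mq_allocated;
} mq_t;

static mq_t mq_list[MQ_SIZE];
static mq_t *mq_freelist;

/* Chain every element of mq_list[] onto the free list. */
static void mq_init_sketch(void)
{
    int i;

    mq_freelist = NULL;
    for (i = 0; i < MQ_SIZE; i++) {
        mq_list[i].mq_allocated = 0;
        mq_list[i].mq_next = mq_freelist;
        mq_freelist = &mq_list[i];
    }
}

/* Take one element off the free list; NULL if none are left. */
static mq_t *mq_get_sketch(void)
{
    mq_t *mq = mq_freelist;

    if (mq == NULL)
        return NULL;
    mq_freelist = mq->mq_next;
    mq->mq_next = NULL;
    mq->mq_allocated = 1;
    return mq;
}

/* Put an element back on the free list for later reuse. */
static void mq_free_sketch(mq_t *mq)
{
    assert(mq != NULL && mq->mq_allocated);
    mq->mq_allocated = 0;
    mq->mq_next = mq_freelist;
    mq_freelist = mq;
}
```

Since the pool is a fixed array, no dynamic allocation is needed while the service is running; getting and freeing a message are both constant-time pointer updates.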


0015006  * The valid messages and their parameters are:
0015007  *
0015008  * Requests:
0015009  *
0015010  * m_type NDEV_MINOR NDEV_PROC NDEV_REF NDEV_MODE
0015011  * -------------------------------------------------------------
0015012  * | DEV_OPEN |minor dev | proc nr | fd | mode |
0015013  * |-------------+-----------+-----------+-----------+----------+
0015014  * | DEV_CLOSE |minor dev | proc nr | fd | |
0015015  * |-------------+-----------+-----------+-----------+----------+
0015016  *
0015017  * m_type NDEV_MINOR NDEV_PROC NDEV_REF NDEV_COUNT NDEV_BUFFER
0015018  * ---------------------------------------------------------------------------
0015019  * | DEV_READ |minor dev | proc nr | fd | count | buf ptr |
0015020  * |-------------+-----------+-----------+-----------+-----------+-----------|
0015021  * | DEV_WRITE |minor dev | proc nr | fd | count | buf ptr |
0015022  * |-------------+-----------+-----------+-----------+-----------+-----------|
0015023  *
0015024  * m_type NDEV_MINOR NDEV_PROC NDEV_REF NDEV_IOCTL NDEV_BUFFER
0015025  * ---------------------------------------------------------------------------
0015026  * | DEV_IOCTL3 |minor dev | proc nr | fd | command | buf ptr |
0015027  * |-------------+-----------+-----------+-----------+-----------+-----------|
0015028  *
0015029  * m_type NDEV_MINOR NDEV_PROC NDEV_REF NDEV_OPERATION
0015030  * -------------------------------------------------------------------|
0015031  * | DEV_CANCEL |minor dev | proc nr | fd | which operation|
0015032  * |-------------+-----------+-----------+-----------+----------------|
0015033  *
0015034  * Replies:
0015035  *
0015036  * m_type REP_PROC_NR REP_STATUS REP_REF REP_OPERATION
0015037  * ----------------------------------------------------------------------|
0015038  * | DEVICE_REPLY | proc nr | status | fd | which operation |
0015039  * |--------------+-------------+------------+---------+-----------------|
0015040  */
0015041 
0015042 #include "inet.h"
0015043 
0015044 #include <minix/callnr.h>
0015045 
0015046 #include "mq.h"
0015047 #include "proto.h"
0015048 #include "generic/type.h"
0015049 
0015050 #include "generic/assert.h"
0015051 #include "generic/buf.h"
0015052 #include "generic/sr.h"
0015053 
0015054 THIS_FILE
0015055 
0015056 #define FD_NR                     (16*IP_PORT_MAX)
IP_PORT_MAX is the maximum number of network devices (4 for the 386) and is defined in inet/inet_config.h, so FD_NR is 16*4 = 64. There can be two ethernet network devices and two psip network devices.


0015057 
0015058 typedef struct sr_fd
0015059 {
0015060          int srf_flags;
0015061          int srf_fd;
0015062          int srf_port;
0015063          sr_open_t srf_open;
0015064          sr_close_t srf_close;
0015065          sr_write_t srf_write;
0015066          sr_read_t srf_read;
0015067          sr_ioctl_t srf_ioctl;
0015068          sr_cancel_t srf_cancel;
0015069          mq_t *srf_ioctl_q, *srf_ioctl_q_tail;
0015070          mq_t *srf_read_q, *srf_read_q_tail;
0015071          mq_t *srf_write_q, *srf_write_q_tail;
0015072 } sr_fd_t;

0015073 
0015074 #define SFF_FLAGS        0x0F
0015075 #       define SFF_FREE              0x00
0015076 #       define SFF_MINOR       0x01
0015077 #       define SFF_INUSE       0x02
0015078 #       define SFF_BUSY              0x3C
0015079 #              define SFF_IOCTL_IP       0x04
0015080 #              define SFF_READ_IP       0x08
0015081 #              define SFF_WRITE_IP       0x10
0015082 #       define SFF_PENDING_REQ       0x30
0015083 #       define SFF_SUSPENDED       0x1C0
0015084 #              define SFF_IOCTL_SUSP       0x40
0015085 #              define SFF_READ_SUSP       0x80
0015086 #              define SFF_WRITE_SUSP       0x100
0015087 
The "SFF" above stands for "Send/Receive File Descriptor Flags". These flags determine the state of the file descriptor.


0015088 FORWARD _PROTOTYPE ( int sr_open, (message *m) );
0015089 FORWARD _PROTOTYPE ( void sr_close, (message *m) );
0015090 FORWARD _PROTOTYPE ( int sr_rwio, (mq_t *m) );
0015091 FORWARD _PROTOTYPE ( int sr_cancel, (message *m) );
0015092 FORWARD _PROTOTYPE ( void sr_reply, (mq_t *m, int reply, int can_enqueue) );
0015093 FORWARD _PROTOTYPE ( sr_fd_t *sr_getchannel, (int minor));
0015094 FORWARD _PROTOTYPE ( acc_t *sr_get_userdata, (int fd, vir_bytes offset,
0015095                                              vir_bytes count, int for_ioctl) );
0015096 FORWARD _PROTOTYPE ( int sr_put_userdata, (int fd, vir_bytes offset,
0015097                                           acc_t *data, int for_ioctl) );
0015098 FORWARD _PROTOTYPE ( int sr_repl_queue, (int proc, int ref, int operation) );
0015099 FORWARD _PROTOTYPE ( int walk_queue, (sr_fd_t *sr_fd, mq_t *q_head,
0015100                            mq_t **q_tail_ptr, int type, int proc_nr, int ref) );
0015101 FORWARD _PROTOTYPE ( void process_req_q, (mq_t *mq, mq_t *tail,
0015102                                                  mq_t **tail_ptr) );
0015103 FORWARD _PROTOTYPE ( int cp_u2b, (int proc, char *src, acc_t **var_acc_ptr,
0015104                                                          int size) );
0015105 FORWARD _PROTOTYPE ( int cp_b2u, (acc_t *acc_ptr, int proc, char *dest) );
0015106 
0015107 PRIVATE sr_fd_t sr_fd_table[FD_NR];

0015108 PRIVATE mq_t *repl_queue, *repl_queue_tail;
repl_queue / repl_queue_tail

repl_queue is the head of the linked list (repl_queue_tail is the tail) of REVIVE messages that had been previously (unsuccessfully) sent to the file system (FS) but had not been delivered due to deadlock. Before any new messages are processed by the network service, a second (and last) attempt is made by sr_repl_queue() to send out these REVIVE messages in repl_queue.


0015109 PRIVATE cpvec_t cpvec[CPVEC_NR];
0015110 
0015111 PUBLIC void sr_init()
sr_init()

sr_init() initializes sr_fd_table[] and repl_queue.


0015112 {
0015113 #if ZERO
0015114          int i;
0015115 
0015116          for (i=0; i<FD_NR; i++)
FD_NR for the 80386 is 64 (see line 15056). It is the number of elements in sr_fd_table[].

A few of these elements are filled in later by calls to sr_add_minor() (which is called by a client's initialization function - e.g., ip_init()).


0015117                   sr_fd_table[i].srf_flags= SFF_FREE;
0015118          repl_queue= NULL;
0015119 #endif
0015120 }
0015121 
0015122 PUBLIC void sr_rec(m)
0015123 mq_t *m;
sr_rec()

The network service receives messages from the file system (FS), the ethernet task, the clock task, and the asynchronous alarm task. sr_rec() is called upon the receipt of a message from the file system in the endless loop within main().

If there are any messages in repl_queue, sr_rec() first calls sr_repl_queue() to process the messages in repl_queue before calling sr_open() (for open requests), sr_close() (for close requests), sr_rwio() (for read, write, and ioctl requests), or sr_cancel() (for cancel requests) to further process the message.

Note that "rec" in "sr_rec" stands for "receive".


0015124 {
0015125          int result;
0015126          int send_reply, free_mess;
0015127 
0015128          if (repl_queue)
Before any new messages can be processed, the messages in the queue of reply messages (repl_queue) must be sent by sr_repl_queue() (see lines 15132 and 15140). If the message is a cancellation message (NW_CANCEL), sr_repl_queue() sends the cancellation message after all the other reply messages have been sent.

All reply messages are sent to the file system.




0015129          {
0015130                   if (m->mq_mess.m_type == NW_CANCEL)
0015131                   {
0015132                            result= sr_repl_queue(m->mq_mess.PROC_NR, 0, 0);
sr_repl_queue()

If repl_queue is not empty, sr_repl_queue(proc, ref, operation) is called to send all messages (which will be REVIVE messages) in repl_queue to the file system. If called with proc, sr_repl_queue()'s first parameter, set to ANY (#define'd in include/minix/com.h - this value is not within allowable range of process numbers), sr_repl_queue() goes through repl_queue and sends each reply message to the appropriate process. If called with proc set to a specific process number (which is the case when sr_rec() is processing an NW_CANCEL message), sr_repl_queue() goes through repl_queue and sends every REVIVE message except for any messages destined for the process specified by proc (if any exist). sr_repl_queue() finally sends the message destined to the process specified by proc after all other messages have been sent. While the file system is still interested in the REVIVE message, the priority goes to other replies.

In addition to sending all the reply messages queued in repl_queue, sr_repl_queue() frees the messages in repl_queue for later use and sets repl_queue to NULL.

A return value of 1 indicates that a message in repl_queue was canceled. A return value of 0 indicates that no messages were canceled.

The second and third parameters (ref and operation) are never used in sr_repl_queue().
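
The ordering rule described above (send everything else first, then the replies for the canceling process, and report whether any were found) can be sketched as below. This is an illustration of the description only: reply_t is a reduced stand-in for mq_t, ANY_PROC is a hypothetical stand-in for ANY, send_reply() just records the destination, and flush_replies() is not the real sr_repl_queue().

```c
#include <assert.h>
#include <stddef.h>

#define ANY_PROC   (-1)         /* hypothetical stand-in for ANY */
#define MAX_SENT   16

typedef struct reply {
    int proc;                   /* destination process of the reply */
    struct reply *next;
} reply_t;

static int sent[MAX_SENT];      /* destinations, in the order they were sent */
static int nsent;

/* Stand-in for actually sending the message to the file system. */
static void send_reply(const reply_t *r)
{
    if (nsent < MAX_SENT)
        sent[nsent++] = r->proc;
}

/* Send every queued reply, deferring those destined for proc until the
 * end. Empties the queue. Returns 1 if a reply for proc was found. */
static int flush_replies(reply_t **queue, int proc)
{
    reply_t *r, *next;
    reply_t *deferred_head = NULL, *deferred_tail = NULL;
    int found = 0;

    assert(queue != NULL);
    for (r = *queue; r != NULL; r = next) {
        next = r->next;
        if (proc != ANY_PROC && r->proc == proc) {
            r->next = NULL;     /* move onto the deferred list */
            if (deferred_tail != NULL)
                deferred_tail->next = r;
            else
                deferred_head = r;
            deferred_tail = r;
            found = 1;
        } else {
            send_reply(r);
        }
    }
    for (r = deferred_head; r != NULL; r = r->next)
        send_reply(r);          /* lowest priority: sent last */
    *queue = NULL;
    return found;
}
```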


0015133                            if (result)
A REVIVE message destined to the same process as the origin of the NW_CANCEL message just received was found in repl_queue. Since the NW_CANCEL message arrived too late, simply free the NW_CANCEL message and return.


0015134                            {
0015135                                     mq_free(m);
mq_free()

mq_free(mq) places mq, mq_free()'s only parameter, back on mq_freelist.




0015136                                     return;       /* canceled request in queue */
0015137                            }
0015138                   }
0015139                   else
0015140                            sr_repl_queue(ANY, 0, 0);

0015141          }
0015142 
0015143          switch (m->mq_mess.m_type)
Hand the message off to the request-specific function.


messages from the FS


In order to communicate with one another, the services (the file system (FS), the memory manager (MM), and the network service), the kernel, and the user processes send messages back and forth. These messages take the following form:


typedef struct { 

int m_source; /* who sent the message */
int m_type; /* what kind of message is it */
union {
mess_1 m_m1;
mess_2 m_m2;
mess_3 m_m3;
mess_4 m_m4;
mess_5 m_m5;
mess_6 m_m6;
} m_u;
} message;
typedef struct {int m1i1, m1i2, m1i3; char *m1p1, *m1p2, *m1p3;} mess_1;
typedef struct {int m2i1, m2i2, m2i3; long m2l1, m2l2; char *m2p1;} mess_2;
typedef struct {int m3i1, m3i2; char *m3p1; char m3ca1[M3_STRING];} mess_3;
typedef struct {long m4l1, m4l2, m4l3, m4l4, m4l5;} mess_4;
typedef struct {char m5c1, m5c2; int m5i1, m5i2; long m5l1, m5l2, m5l3;}mess_5;
typedef struct {int m6i1, m6i2, m6i3; long m6l1; sighandler_t m6f1;} mess_6;

(In the names of the fields above, "i" stands for integer, "l" stands for long, and "p" stands for pointer.)

The message types (i.e., m_type) that the FS sends are DEV_OPEN, DEV_CLOSE, DEV_READ, DEV_WRITE, DEV_IOCTL, and CANCEL (these are the cases of the switch on line 15143). For example, if a user process opens the /dev/udp file, the FS sends a message of type "DEV_OPEN" to the network service. The type of message that the network service always sends the FS is DEVICE_REPLY.

Recall that only a single member of a union is valid at any one time. When the file system (FS) receives a request (open, read, write, etc.) for one of the device files associated with the network service (/dev/udp, /dev/tcp, /dev/ip, etc.), the FS sends a message whose m_u holds a mess_2 to the network service. Repeated from above, the mess_2 typedef is declared as follows:

typedef struct {int m2i1, m2i2, m2i3; long m2l1, m2l2; char *m2p1;} mess_2;


m_source | m_type | DEVICE (m2_i1) | PROC_NR (m2_i2) | COUNT or REQUEST (m2_i3) | POSITION (m2_l1) | TTY_FLAGS (m2_l2) | ADDRESS (m2_p1)


Note: The POSITION field is not used here, and TTY_FLAGS is always false (zero). A TTY_FLAGS value of zero denotes blocking mode (in other words, the process is waiting for a response).

The meaning of each of these fields is as follows:

DEVICE is the minor number of the device.
PROC_NR is the process number of the client.
COUNT is the number of bytes to transfer.
REQUEST is specific to ioctl operations. An example of an ioctl operation is NWIOSUDPOPT (NetWork IO Set UDP OPTions).
POSITION is the position in the device file.
TTY_FLAGS is zero (blocking mode).
ADDRESS is a pointer to a buffer. This buffer will either be read from or written to by the network service.
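
To make the field layout concrete, the following sketch fills in a read request the way the table above describes it. It is a simplified model under stated assumptions: only the mess_2 member of the union is reproduced, the field aliases (m2_i1, DEVICE, and so on) are written out to match the table, and make_read_request() and the numeric value of EX_DEV_READ are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Only mess_2 is reproduced here; the real union has six members. */
typedef struct { int m2i1, m2i2, m2i3; long m2l1, m2l2; char *m2p1; } mess_2;

typedef struct {
    int m_source;              /* who sent the message */
    int m_type;                /* what kind of message is it */
    union { mess_2 m_m2; } m_u;
} message;

/* Field aliases, following the table above. */
#define m2_i1 m_u.m_m2.m2i1
#define m2_i2 m_u.m_m2.m2i2
#define m2_i3 m_u.m_m2.m2i3
#define m2_l1 m_u.m_m2.m2l1
#define m2_l2 m_u.m_m2.m2l2
#define m2_p1 m_u.m_m2.m2p1

#define DEVICE    m2_i1
#define PROC_NR   m2_i2
#define COUNT     m2_i3
#define POSITION  m2_l1
#define TTY_FLAGS m2_l2
#define ADDRESS   m2_p1

#define EX_DEV_READ 3          /* hypothetical numeric value */

/* Build the request the FS would send for a blocking read of `nbytes`
 * bytes into `buf`, which lives in the address space of `client`. */
message make_read_request(int fs_proc, int minor, int client,
                          char *buf, int nbytes)
{
    message m;

    memset(&m, 0, sizeof m);   /* POSITION and TTY_FLAGS stay zero */
    m.m_source = fs_proc;
    m.m_type   = EX_DEV_READ;
    m.DEVICE   = minor;
    m.PROC_NR  = client;
    m.COUNT    = nbytes;
    m.ADDRESS  = buf;
    return m;
}
```

For example, make_read_request(1, 4, 9, buf, 32) models a 32-byte read on minor device 4 (/dev/udp0 in the configuration shown earlier) on behalf of process 9.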

It is interesting to compare the fields above with the parameters for dev_io (from the file system):

PUBLIC int dev_io(op, nonblock, dev, pos, bytes, proc, buff)
int op; /* DEV_READ, DEV_WRITE, DEV_IOCTL, etc. */
int nonblock; /* TRUE if nonblocking op */
dev_t dev; /* major-minor device number */
off_t pos; /* byte position */
int bytes; /* how many bytes to transfer */
int proc; /* in whose address space is buff? */
char *buff; /* virtual address of the buffer */


The format of reply messages is different:


m_source | DEVICE_REPLY | proc nr | status | fd | operation | unused | unused



The message queue is a linked list of mq_t struct's:

typedef struct mq

{
message mq_mess;
struct mq *mq_next;
int mq_allocated;
} mq_t;
Immediately after initialization, the message queue is as follows:



In preparation for receiving a message, mq_get() is called to take one of the mq_t structs off the free list of messages. This element will hold the next message received and will be passed to the code that handles messages from the file system, the asynchronous alarm task, or the ethernet task.
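
The free-list handling can be sketched as follows. This is a minimal model, assuming a fixed pool threaded onto a singly linked free list; the pool size EX_MQ_SIZE, the simplified message type, and the body of mq_init() are assumptions, and the real mq_get() and mq_free() may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

#define EX_MQ_SIZE 4           /* hypothetical pool size */

typedef struct { int m_source, m_type; } message;   /* simplified */

typedef struct mq {
    message mq_mess;
    struct mq *mq_next;
    int mq_allocated;
} mq_t;

mq_t mq_list[EX_MQ_SIZE];
mq_t *mq_freelist;

/* Thread every element of mq_list[] onto the free list. */
void mq_init(void)
{
    int i;

    mq_freelist = NULL;
    for (i = 0; i < EX_MQ_SIZE; i++) {
        mq_list[i].mq_allocated = 0;
        mq_list[i].mq_next = mq_freelist;
        mq_freelist = &mq_list[i];
    }
}

/* Take one element off the free list, or return NULL if none is left. */
mq_t *mq_get(void)
{
    mq_t *mq = mq_freelist;

    if (mq == NULL)
        return NULL;
    mq_freelist = mq->mq_next;
    mq->mq_next = NULL;
    mq->mq_allocated = 1;
    return mq;
}

/* Put an element back on the free list. */
void mq_free(mq_t *mq)
{
    mq->mq_allocated = 0;
    mq->mq_next = mq_freelist;
    mq_freelist = mq;
}
```

Because mq_free() pushes onto the head of the list, the element freed most recently is the one returned by the next mq_get().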


0015144          {
0015145          case DEV_OPEN:
0015146                   result= sr_open(&m->mq_mess);
sr_open()

sr_open() examines the minor number of the message (the DEVICE field, more specifically; this will reflect which file was opened - for example, if the DEVICE field is equal to 4, /dev/udp0 was opened) and copies the sr_fd_table[] element whose index corresponds to this minor number to the first unused entry of the table (thereby opening a channel). sr_open() sets the element's flag to SFF_INUSE and then calls the protocol specific open function (e.g., udp_open()) for the channel (a pointer to this function is found in the srf_open field of the element). This function's return value sets the element's file descriptor field (srf_fd). Finally sr_open() returns the index of the new element.



0015147                   send_reply= 1;
0015148                   free_mess= 1;
Read, write, and ioctl operations free their own messages. Open, close, and cancel operations do not free messages. Therefore, free_mess equals zero for read, write, and ioctl operations and free_mess equals one for open, close, and cancel operations.

As an example of a read/write/ioctl operation freeing messages, consider the following (straightforward) scenario where there are no messages in the read message queue of a file descriptor (i.e., srf_read_q and srf_read_q_tail are both null) but data was received (perhaps by the ethernet task) and the file descriptor was opened by the udp client. If a message arrives requesting a read operation (i.e., m->mq_mess.m_type == DEV_READ), udp_read() is called, which in turn calls udp_packet2user(). udp_packet2user() calls sr_put_userdata() twice. The first time sr_put_userdata() is called, data (which was previously received) is transferred to the user process. The second time that sr_put_userdata() is called, a reply is sent to the user process by sr_reply(), which ultimately calls mq_free() to free the message.


Below is a more complex example of message releasing. In this example, no read messages are waiting in the read queue of a file descriptor opened by the udp client. However, this time, there is no data waiting to be read.

A DEV_READ message (message 1) arrives and sr_rec() calls sr_rwio() to handle the message. Since there are no messages in the read queue, SFF_READ_IP is not initially set. After setting the SFF_READ_IP flag, sr_rwio() calls udp_read(). However, since no data is waiting to be read, udp_read() returns NW_SUSPEND. sr_rwio() then sets the SFF_READ_SUSP flag and returns NW_SUSPEND to sr_rec(), which then calls sr_reply() to send a REVIVE message back to the file system. In other words, sr_reply() instructs the FS to unblock the process requesting the read.



A second DEV_READ message (message 2) arrives and sr_rec() again calls sr_rwio(). Because the SFF_READ_IP flag is already raised by the first message, the second message is appended to the read queue and sr_rwio() returns NW_SUSPEND. Again, sr_reply() sends a REVIVE message to the file system.



A third DEV_READ message (message 3) arrives and, as before, is placed at the end of the read queue and a REVIVE message is sent to the file system.



Finally, some data arrives (possibly from the ethernet task). udp_packet2user() is ultimately called and it, in turn, calls sr_put_userdata() twice: the first time to deliver the packet to the buffer of the user process and the second time to return the number of bytes delivered to the user process. The second time sr_put_userdata() is called, it calls sr_reply(), which sets the read queue head to null.



After sr_reply() returns, sr_put_userdata() then calls process_req_q(), passing in mq as the first parameter. process_req_q() goes through the remaining messages, starting with mq (see figure above), handling the read requests if possible. If there is enough data to satisfy all of the read requests (i.e., the requests of messages 2 and 3), the read queue is as appears in the figure on the left. If, for example, there is not even enough data to satisfy the request of message 2, the message queue is as appears in the figure on the right.



If there was enough data to satisfy the requests of messages 2 and 3, the remaining messages are freed by sr_reply(), just as message 1 was previously freed.

The same principle is also true for send_reply. If the operation itself sends a reply (by calling sr_reply()), there is no need to call it. Open, close, and cancel operations do not send replies and, therefore, a reply (see line 15179) for these operations is always necessary. Read, write, and ioctl send replies for themselves if the operation is successful. However, if the read/write/ioctl operation is not successful (i.e., the operation is suspended), a reply must be sent.


0015149                   break;
0015150          case DEV_CLOSE:
0015151                   sr_close(&m->mq_mess);
sr_close()

sr_close() closes a channel by calling the channel's protocol-specific close function (e.g., udp_close()) and marking the channel's corresponding element in sr_fd_table[] as free.


0015152                   result= OK;
0015153                   send_reply= 1;
0015154                   free_mess= 1;
0015155                   break;
0015156          case DEV_READ:
0015157          case DEV_WRITE:
0015158          case DEV_IOCTL:
0015159                   result= sr_rwio(m);
sr_rwio()

sr_rwio() is either called by sr_rec() upon receipt of a read, write, or ioctl request or by process_req_q() to handle queued messages. The handling of a read, write, or ioctl request is a complex operation that is best explained by an example. We will analyze two different read requests, one with data ready to be delivered to user processes and one with no data ready to be delivered.

SCENARIO 1: There is data ready to be delivered to the user process and there are no messages in the read queue (i.e., srf_read_q and srf_read_q_tail are both null). The file descriptor was opened by the udp client.

When a message arrives requesting a read operation (i.e., m->mq_mess.m_type == DEV_READ), udp_read() is called, which in turn calls udp_packet2user(). udp_packet2user() calls sr_put_userdata() twice. The first time sr_put_userdata() is called, the data is transferred to the user process. The second time sr_put_userdata() is called, a reply is sent to the user process by sr_reply(), which ultimately calls mq_free(), freeing the message.

SCENARIO 2: There are no read messages waiting in the read queue of a file descriptor opened by the udp client. However, this time, there is no data waiting to be read.

A DEV_READ message (message 1) arrives and sr_rec() calls sr_rwio() to handle the message. Since there are no messages in the read queue, SFF_READ_IP is not raised. After setting the SFF_READ_IP flag, sr_rwio() calls udp_read(). However, since no data is waiting to be read, udp_read() returns NW_SUSPEND. sr_rwio() then sets the SFF_READ_SUSP flag and returns NW_SUSPEND to sr_rec(), which then sends a REVIVE message back to the file system (in other words, it instructs the FS to revive the process requesting the read). After this, the sr file descriptor is shown:



A second DEV_READ message (message 2) arrives and sr_rec() again calls sr_rwio(). Because the SFF_READ_IP flag is already raised by the first message, the second message is appended to the read queue and sr_rwio() returns NW_SUSPEND. Again, sr_reply() sends a REVIVE message to the file system.



A third DEV_READ message (message 3) arrives and, as before, is placed at the end of the read queue and a REVIVE message is sent to the file system.



Finally, some data arrives (possibly from the ethernet task). udp_packet2user() is ultimately called and it, in turn, calls sr_put_userdata() twice: the first time to deliver the packet to the buffer of the user process and the second time to return the number of bytes delivered to the user process. The second time sr_put_userdata() is called, it calls sr_reply(), which sets the read queue head to null.



After sr_reply() returns, sr_put_userdata() then calls process_req_q(), passing in mq as the first parameter. process_req_q() goes through the remaining messages, starting with mq (see figure above), handling the read requests (by calling sr_rwio()) if possible. If there is enough data to satisfy all of the read requests (i.e., the requests of messages 2 and 3), the read queue is as appears in the figure on the left. If, for example, there is not even enough data to satisfy the request of message 2, the message queue is as appears in the figure on the right.



If there was enough data to satisfy the requests of messages 2 and 3, the remaining messages are freed by sr_reply(), just as message 1 was previously freed. Note that the srf_read_q field in the right-hand figure above was set by sr_rwio() before returning.
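
The queueing behaviour of the two scenarios can be modelled with a small sketch. It is not the code of sr_rwio(): chan_t, req_t, rwio_read(), the callback read_fn (which stands in for srf_read), and the EX_OK and EX_SUSPEND values are all hypothetical. In particular, the cleanup on success is done inline here, whereas the text above describes it as happening inside sr_reply().

```c
#include <assert.h>
#include <stddef.h>

#define SFF_READ_IP   0x08
#define SFF_READ_SUSP 0x80

#define EX_OK        0         /* hypothetical result codes */
#define EX_SUSPEND (-998)

typedef struct req { int proc; struct req *next; } req_t;

typedef struct {
    int flags;
    req_t *read_q, *read_q_tail;
    int (*read_fn)(void);      /* stands in for srf_read */
} chan_t;

int no_data(void)    { return EX_SUSPEND; }   /* scenario 2 */
int data_ready(void) { return EX_OK; }        /* scenario 1 */

int rwio_read(chan_t *ch, req_t *rq)
{
    int r;

    rq->next = NULL;
    if (ch->flags & SFF_READ_IP) {
        /* A read is already in progress: append and suspend. */
        ch->read_q_tail->next = rq;
        ch->read_q_tail = rq;
        return EX_SUSPEND;
    }

    /* First request: it becomes the head of the queue. */
    ch->read_q = ch->read_q_tail = rq;
    ch->flags |= SFF_READ_IP;

    r = ch->read_fn();
    if (r == EX_SUSPEND) {
        ch->flags |= SFF_READ_SUSP;
    } else {
        /* Completed at once: the queue is empty again. */
        ch->flags &= ~SFF_READ_IP;
        ch->read_q = ch->read_q_tail = NULL;
    }
    return r;
}
```

With no_data() as the callback, three successive calls leave messages 1, 2, and 3 chained on the read queue, each call returning EX_SUSPEND; with data_ready(), the first call returns EX_OK and the queue stays empty.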


0015160                   assert(result == OK || result == SUSPEND);
0015161                   send_reply= (result == SUSPEND);
0015162                   free_mess= 0;
0015163                   break;
0015164          case CANCEL:
0015165                   result= sr_cancel(&m->mq_mess);
sr_cancel()

Unlike sr_open(), sr_close(), and sr_rwio(), sr_cancel() does not call the protocol-specific cancel function directly. Instead, sr_cancel() calls walk_queue() for each of the channel's queues (read, write, and ioctl); walk_queue() walks the given queue until it finds a message with the same process number.

If the message is the first message in the queue, walk_queue() calls the protocol-specific cancel function (see line 15422). If a message is found but it is not the first message in the queue, walk_queue() removes the message from the queue and returns EINTR. If a message is not found, walk_queue() returns EAGAIN (which is a problem, since the assert on line 15166 accepts only OK or EINTR).
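
The three outcomes can be sketched as follows. This is a simplified model, not the function in sr.c: req_t, walk_queue_sketch(), proto_cancel() (which stands in for the function behind srf_cancel), and the cancel_calls counter are hypothetical, and the error codes come from the host's errno.h rather than from Minix.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef struct req { int proc; struct req *next; } req_t;

int cancel_calls;              /* counts calls to the stand-in below */

/* Stand-in for the protocol-specific cancel function (e.g. udp_cancel()). */
int proto_cancel(int fd)
{
    (void)fd;
    cancel_calls++;
    return 0;                  /* OK */
}

/* Look for a request from `proc` in one queue.
 *   head of queue -> hand off to the protocol's cancel function
 *   elsewhere     -> unlink the request and return EINTR
 *   not found     -> return EAGAIN */
int walk_queue_sketch(req_t **q_head, req_t **q_tail, int fd, int proc)
{
    req_t *prev = NULL, *cur;

    for (cur = *q_head; cur != NULL; prev = cur, cur = cur->next) {
        if (cur->proc != proc)
            continue;
        if (prev == NULL)
            return proto_cancel(fd);   /* request is in progress */
        prev->next = cur->next;        /* unlink a waiting request */
        if (*q_tail == cur)
            *q_tail = prev;
        cur->next = NULL;
        return EINTR;
    }
    return EAGAIN;
}
```

The head of the queue is treated differently because it is the request the protocol layer is currently working on; requests behind it have not been handed to the protocol yet and can simply be unlinked.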


0015166                   assert(result == OK || result == EINTR);
0015167                   send_reply= (result == EINTR);
0015168                   free_mess= 1;
0015169                   m->mq_mess.m_type= 0;
0015170                   break;
0015171 #if !CRAMPED
0015172          default:
0015173                   ip_panic(("unknown message, from %d, type %d",
0015174                                     m->mq_mess.m_source, m->mq_mess.m_type));
0015175 #endif
0015176          }
0015177          if (send_reply)
0015178          {
0015179                   sr_reply(m, result, FALSE);
sr_reply()

sr_reply(mq, status, can_enqueue) attempts to send a REVIVE message back to the file system (FS) (which is what the FS expects to receive). sr_reply() is called from sr_rec() and from sr_get_userdata() and sr_put_userdata().

If called from sr_rec(), there are two scenarios. If the message mq (sr_reply()'s first parameter) was a request for a DEV_OPEN, DEV_CLOSE, or CANCEL, a reply is guaranteed to succeed and therefore there is no need to enqueue the message in repl_queue. If the message mq was a request for a DEV_READ, DEV_WRITE, or DEV_IOCTL operation that was suspended, a reply is also guaranteed to succeed. In both of these scenarios, it is not necessary to queue the message in repl_queue and, therefore, can_enqueue is FALSE if called from sr_rec().

However, if sr_reply() is called from sr_get_userdata() or sr_put_userdata(), the message may fail due to deadlock. Also, the message must eventually reach the file system so that the user process may be unblocked. Therefore, the reply message must be placed in repl_queue if the message doesn't succeed due to deadlock. For this reason, can_enqueue is TRUE if called from sr_get_userdata() or sr_put_userdata().

It is important to understand why deadlocks can occur. During the initialization of the network service, the network service calls svrctl() (which calls do_svrctl()) to register itself with the FS. do_svrctl() sets the fields of the network service's dmap[] entry to gen_io() for read, write, and ioctl requests and clone_opcl() for open and close requests. For this reason, whenever a read, write, or ioctl request for a network file is received by the file system, gen_io() is called and whenever an open or close request is received by the file system, clone_opcl() is called (note that clone_opcl() calls gen_io()).

If gen_io() calls sendrec() to send a message to the network service and there is a deadlock because the network service also attempted to send a message to the file system (i.e., sendrec() returned ELOCKED), gen_io() will call receive() instead.

If the read, write, or ioctl request could not be immediately satisfied, the network service will later send() another REVIVE message after the read, write, or ioctl operation has completed (successfully or unsuccessfully). However, the file system will not necessarily handle any deadlock that occurs (since the main loop in the file system may possibly handle the message) and so, if a deadlock occurs, the network service must later attempt to send the REVIVE message that was not successful due to the deadlock. The network service does this by placing the message in repl_queue. The messages in repl_queue will be sent again the next time that sr_repl_queue() is called.
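
The role of can_enqueue can be illustrated with a short sketch. It is a model under stated assumptions, not the body of sr_reply(): fake_send() stands in for the kernel's send(), next_send_result is a test hook, and repl_queue_tail, reply_sketch(), and the EX_OK and EX_ELOCKED values are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define EX_OK        0         /* hypothetical result codes */
#define EX_ELOCKED (-101)

typedef struct mq { int proc; struct mq *mq_next; } mq_t;

mq_t *repl_queue, *repl_queue_tail;

/* Stand-in for send(): the hook below decides whether it "deadlocks". */
int next_send_result = EX_OK;

int fake_send(const mq_t *mq)
{
    (void)mq;
    return next_send_result;
}

/* Try to deliver a reply. Returns 1 if it was sent and 0 if it was
 * parked on repl_queue for a later attempt. */
int reply_sketch(mq_t *mq, int can_enqueue)
{
    int r = fake_send(mq);

    if (r == EX_ELOCKED && can_enqueue) {
        mq->mq_next = NULL;
        if (repl_queue != NULL)
            repl_queue_tail->mq_next = mq;
        else
            repl_queue = mq;
        repl_queue_tail = mq;
        return 0;
    }
    /* Callers that pass can_enqueue == 0 rely on the send succeeding. */
    assert(r == EX_OK);
    return 1;
}
```

Replies parked this way keep their arrival order, so a later drain of repl_queue delivers them in the order in which the operations completed.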


0015180          }
0015181          if (free_mess)
0015182                   mq_free(m);

0015183 }
0015184 
0015185 PUBLIC void sr_add_minor(minor, port, openf, closef, readf, writef,
0015186          ioctlf, cancelf)
0015187 int minor;
0015188 int port;
0015189 sr_open_t openf;
0015190 sr_close_t closef;
0015191 sr_read_t readf;
0015192 sr_write_t writef;
0015193 sr_ioctl_t ioctlf;
0015194 sr_cancel_t cancelf;
sr_fd / sr_fd_table[] / sr_add_minor()

One of the most important data arrays in the network service is sr_fd_table[], an array of 64 struct sr_fd's. Each sr_fd element in sr_fd_table[] corresponds to either a device or an opened file descriptor to a device (i.e., a "channel"):

typedef struct sr_fd

{
int srf_flags;
int srf_fd;
int srf_port;
sr_open_t srf_open;
sr_close_t srf_close;
sr_write_t srf_write;
sr_read_t srf_read;
sr_ioctl_t srf_ioctl;
sr_cancel_t srf_cancel;
mq_t *srf_ioctl_q, *srf_ioctl_q_tail;
mq_t *srf_read_q, *srf_read_q_tail;
mq_t *srf_write_q, *srf_write_q_tail;
} sr_fd_t;
For each device (e.g., /dev/udp0), an element in sr_fd_table[] is configured by sr_add_minor(). For example, for the following inet.conf file:

eth0 DP8390 0 { default; };
psip1;

an element (i.e., a struct sr_fd) is configured for each of the following devices:

/dev/eth0 sr_fd_table[1]
/dev/ip0 sr_fd_table[2]
/dev/tcp0 sr_fd_table[3]
/dev/udp0 sr_fd_table[4]

/dev/psip1 sr_fd_table[17]
/dev/ip1 sr_fd_table[18]
/dev/tcp1 sr_fd_table[19]
/dev/udp1 sr_fd_table[20]




sr_add_minor() is called in the initialization routines for the various protocols: mnx_eth.c (osdep_eth_init()), psip.c (psip_enable()), ip.c (ip_init()), tcp.c (tcp_init()), and udp.c (udp_init()).



When a device file (e.g., /dev/udp0) is opened by a process, the element that corresponds to the device is copied to an element that is currently unoccupied (see sr_open()). In this way, a "channel" is opened. Using this technique, a channel can be opened, closed, and manipulated without affecting the elements of the descriptors initially set by sr_add_minor().


int srf_flags:

srf_flags is a combination of the following:

#define SFF_FREE 0x00
#define SFF_MINOR 0x01
#define SFF_INUSE 0x02
#define SFF_BUSY 0x3C
#define SFF_IOCTL_IP 0x04
#define SFF_READ_IP 0x08
#define SFF_WRITE_IP 0x10
#define SFF_PENDING_REQ 0x30
#define SFF_SUSPENDED 0x1C0
#define SFF_IOCTL_SUSP 0x40
#define SFF_READ_SUSP 0x80
#define SFF_WRITE_SUSP 0x100

srf_flags is initialized to SFF_FREE for each element in sr_fd_table[]. If the channel corresponds to a device file, srf_flags is set to SFF_INUSE | SFF_MINOR. If the channel does not correspond to a device file, srf_flags is set simply to SFF_INUSE.

When a request comes in for a read, write, or ioctl operation and the network service is not already processing another request for the same operation, srf_flags is set to SFF_READ_IP, SFF_WRITE_IP, or SFF_IOCTL_IP. However, if an operation is attempted but the underlying protocol is still processing a previous request of the same nature (e.g., udp_write()), the appropriate flag (SFF_IOCTL_SUSP, SFF_READ_SUSP, or SFF_WRITE_SUSP) in srf_flags is set.
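
The flag values are bit masks, so they are tested and combined with bitwise operators. The sketch below repeats the definitions and adds a few helper predicates. The helpers (chan_is_free() and so on) are hypothetical, and the value 0x100 for SFF_WRITE_SUSP is an assumption, chosen because it is the one remaining bit of SFF_SUSPENDED (0x1C0) once 0x40 and 0x80 are accounted for.

```c
#include <assert.h>

#define SFF_FREE        0x00
#define SFF_MINOR       0x01
#define SFF_INUSE       0x02
#define SFF_BUSY        0x3C
#define SFF_IOCTL_IP    0x04
#define SFF_READ_IP     0x08
#define SFF_WRITE_IP    0x10
#define SFF_PENDING_REQ 0x30
#define SFF_SUSPENDED   0x1C0
#define SFF_IOCTL_SUSP  0x40
#define SFF_READ_SUSP   0x80
#define SFF_WRITE_SUSP  0x100  /* assumed value, see the text above */

/* Hypothetical helper predicates over srf_flags. */
int chan_is_free(int flags)       { return !(flags & SFF_INUSE); }
int chan_is_device(int flags)     { return (flags & SFF_MINOR) != 0; }
int chan_read_busy(int flags)     { return (flags & SFF_READ_IP) != 0; }
int chan_any_suspended(int flags) { return (flags & SFF_SUSPENDED) != 0; }
```

A device entry therefore carries SFF_INUSE | SFF_MINOR, an opened channel carries SFF_INUSE alone, and the "in progress" and "suspended" bits are ORed in and cleared as requests come and go.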


int srf_fd, srf_port:

srf_fd and srf_port are both set by sr_add_minor(). For the channels in sr_fd_table[] that correspond to the device files (e.g., /dev/udp0), srf_fd is set to the minor device number of the device. For example, if /dev/udp0 is added to sr_fd_table[] and the interface number of the device file is 0 (see comments for ip_conf[]), then the minor device number is:

if2minor(ifno, dev) = ((0)*16 + UDP_DEV) = 0 + 4 = 4
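
The same arithmetic reproduces the whole device table shown earlier. In the sketch below, the macro body follows the calculation above; the EX_-prefixed device constants are assumptions read off that table (for example, /dev/ip0 at index 2), and minor_to_ifno() and minor_to_dev() are hypothetical helpers for the reverse mapping.

```c
#include <assert.h>

/* Device numbers within one interface. UDP_DEV comes from the text;
 * the EX_ values are assumptions taken from the device table. */
#define EX_ETH_DEV  1
#define EX_PSIP_DEV 1
#define EX_IP_DEV   2
#define EX_TCP_DEV  3
#define UDP_DEV     4

/* Sixteen minor numbers are reserved per interface. */
#define if2minor(ifno, dev) ((ifno) * 16 + (dev))

/* Hypothetical helpers for going back from a minor number. */
int minor_to_ifno(int minor) { return minor / 16; }
int minor_to_dev(int minor)  { return minor % 16; }
```

For interface 1 this yields 17, 18, 19, and 20 for the psip, ip, tcp, and udp devices, matching the entries listed for /dev/psip1 through /dev/udp1.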

For the channels in sr_fd_table[] that do not correspond to a device file, srf_fd is the file descriptor for the appropriate protocol. For example, if the file system requests that a udp channel be opened, srf_open is dereferenced and udp_open() is called. udp_open() opens a udp file descriptor and returns the index of the corresponding element in udp_fd_table[]. srf_fd is set to the index of this element.

Later, when the file system requests a read or a write on the open channel, srf_fd is passed into the protocol-specific read or write function (e.g., udp_read()), allowing the protocol-specific function to locate the appropriate file descriptor (e.g., udp file descriptor).

srf_port is more straightforward. srf_port is the index in the protocol's port table. For example, if a system has two udp device files (/dev/udp0 and /dev/udp1), udp_port_table[] will have two entries, 0 and 1. Therefore, srf_port for the entry in sr_fd_table[] that corresponds to /dev/udp0 will be 0 and srf_port for the entry that corresponds to /dev/udp1 will be 1.


sr_open_t srf_open:
sr_close_t srf_close:
sr_write_t srf_write:
sr_read_t srf_read:
sr_ioctl_t srf_ioctl:
sr_cancel_t srf_cancel:


The fields above are all pointers to protocol-specific functions and are all set by sr_add_minor(). For example, when sr_add_minor() is called by udp_init(), srf_open, srf_close, srf_write, srf_read, srf_ioctl, and srf_cancel are set to point to the functions udp_open(), udp_close(), udp_write(), udp_read(), udp_ioctl(), and udp_cancel(). Later, when the file system makes a request to the network service, these functions will be called. For example, if the file system requests that data be written to a channel, srf_write is dereferenced and, if the channel is a udp channel, udp_write() is called.
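
Dispatch through these fields is ordinary C function-pointer use. The sketch below shows the idea for the write path only. Everything prefixed with ex_ is hypothetical, including the simplified (fd, count) signature, and ex_udp_write() is merely a stand-in that records its arguments.

```c
#include <assert.h>

/* Simplified signature; the real sr_write_t may differ. */
typedef int (*ex_write_t)(int fd, int count);

typedef struct {
    int srf_fd;
    ex_write_t srf_write;
} ex_sr_fd_t;

int ex_last_fd;                /* records the fd seen by the stand-in */

/* Stand-in for udp_write(): pretend every byte was accepted. */
int ex_udp_write(int fd, int count)
{
    ex_last_fd = fd;
    return count;
}

/* What the registration step does for one field. */
void ex_add_minor(ex_sr_fd_t *e, ex_write_t writef)
{
    e->srf_write = writef;
}

/* What the generic layer does when a write request arrives: it calls
 * through the pointer without knowing which protocol is behind it. */
int ex_dispatch_write(ex_sr_fd_t *e, int count)
{
    return (*e->srf_write)(e->srf_fd, count);
}
```

The generic layer never names udp_write() itself, which is what lets a single table serve the eth, psip, ip, tcp, and udp devices alike.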

mq_t *srf_ioctl_q, *srf_ioctl_q_tail:
mq_t *srf_read_q, *srf_read_q_tail:
mq_t *srf_write_q, *srf_write_q_tail:


The fields above are linked lists of ioctl, read, and write messages waiting to be processed. When a message requesting an ioctl, read, or write operation is received while an earlier request of the same type is still in progress, the message is placed at the end of the corresponding linked list. If no earlier request of that type is outstanding, the message becomes the head of the list and is processed immediately.


After the initialization of the network service, sr_rec() is called upon receipt of messages from the file system in the endless loop within main(). sr_rec() then calls a function to handle the specific request. For open requests, sr_rec() calls sr_open(); for read, write, and ioctl requests, sr_rec() calls sr_rwio(); for close requests, sr_rec() calls sr_close(); for cancel requests, sr_rec() calls sr_cancel().


0015195 {
0015196          sr_fd_t *sr_fd;
0015197 
sr_add_minor() checks to make sure that the minor device number is an acceptable value (i.e., less than 64 - there can be 16 minor devices for a maximum of 4 interfaces) and also verifies that the element is not already in use (line 15202).

sr_add_minor() then marks the element as used (line 15204) and fills in all the fields of the element (for example, sr_fd_table[4] is filled in for /dev/udp0). The various function fields (e.g., the srf_close field, which receives closef; see line 15207) are filled in with the protocol-specific function (e.g., udp_close()).

The port number is the position of each device for a given protocol in the /etc/inet.conf file. For example, if there are two ethernet devices specified in the inet.conf file, the first ethernet device will have a port number of 0 and the second ethernet device will have a port number of 1. (See line 10086 in mnx_eth.c.)


0015198          assert (minor>=0 && minor<FD_NR);
0015199 
0015200          sr_fd= &sr_fd_table[minor];
0015201 
0015202          assert(!(sr_fd->srf_flags & SFF_INUSE));
0015203 
0015204          sr_fd->srf_flags= SFF_INUSE | SFF_MINOR;
0015205          sr_fd->srf_port= port;
0015206          sr_fd->srf_open= openf;
0015207          sr_fd->srf_close= closef;
0015208          sr_fd->srf_write= writef;
0015209          sr_fd->srf_read= readf;
0015210          sr_fd->srf_ioctl= ioctlf;
0015211          sr_fd->srf_cancel= cancelf;
0015212 }
0015213 
0015214 PRIVATE int sr_open(m)
0015215 message *m;

0015216 {
0015217          sr_fd_t *sr_fd;
0015218 
0015219          int minor= m->DEVICE;
0015220          int i, fd;
0015221 
0015222          if (minor<0 || minor>FD_NR)
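Verify that the minor device number is between 0 and 63 (for the 80386, there can be 64 channels). Note that the comparison as written, minor>FD_NR, does not reject minor == FD_NR, whereas the assert on line 15198 requires minor<FD_NR.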

EINVAL (defined in include/errno.h) stands for "invalid argument".


0015223          {
0015224                   DBLOCK(1, printf("replying EINVAL\n"));
0015225                   return EINVAL;
0015226          }
0015227          if (!(sr_fd_table[minor].srf_flags & SFF_MINOR))
Verify that the sr_fd_table[] element refers to a minor device.


sr_fd / sr_fd_table[] / sr_add_minor()


One of the most important data arrays in the network service is sr_fd_table[], an array of 64 struct sr_fd's. Each sr_fd element in sr_fd_table[] corresponds to either a device or an opened file descriptor to a device (i.e., a "channel"):

typedef struct sr_fd

{
int srf_flags;
int srf_fd;
int srf_port;
sr_open_t srf_open;
sr_close_t srf_close;
sr_write_t srf_write;
sr_read_t srf_read;
sr_ioctl_t srf_ioctl;
sr_cancel_t srf_cancel;
mq_t *srf_ioctl_q, *srf_ioctl_q_tail;
mq_t *srf_read_q, *srf_read_q_tail;
mq_t *srf_write_q, *srf_write_q_tail;
} sr_fd_t;
For each device (e.g., /dev/udp0), an element in sr_fd_table[] is configured by sr_add_minor(). For example, for the following inet.conf file:

eth0 DP8390 0 { default; };
psip1;

an element (i.e., a struct sr_fd) is configured for each of the following devices:

/dev/eth0 sr_fd_table[1]
/dev/ip0 sr_fd_table[2]
/dev/tcp0 sr_fd_table[3]
/dev/udp0 sr_fd_table[4]

/dev/psip1 sr_fd_table[17]
/dev/ip1 sr_fd_table[18]
/dev/tcp1 sr_fd_table[19]
/dev/udp1 sr_fd_table[20]




sr_add_minor() is called in the initialization routines for the various protocols: mnx_eth.c (osdep_eth_init()), psip.c (psip_enable()), ip.c (ip_init()), tcp.c (tcp_init()), and udp.c (udp_init()).



When a device file (e.g., /dev/udp0) is opened by a process, the element that corresponds to the device is copied to an element that is currently unoccupied (see sr_open()). In this way, a "channel" is opened. Using this technique, a channel can be opened, closed, and manipulated without affecting the elements of the descriptors initially set by sr_add_minor().


int srf_flags:

srf_flags is a combination of the following:

#define SFF_FREE 0x00
#define SFF_MINOR 0x01
#define SFF_INUSE 0x02
#define SFF_BUSY 0x3C
#define SFF_IOCTL_IP 0x04
#define SFF_READ_IP 0x08
#define SFF_WRITE_IP 0x10
#define SFF_PENDING_REQ 0x30
#define SFF_SUSPENDED 0x1C0
#define SFF_IOCTL_SUSP 0x40
#define SFF_READ_SUSP 0x80
#define SFF_WRITE_SUSP

srf_flags is initialized to SFF_FREE for each element in sr_fd_table[]. If the channel corresponds to a device file, srf_flags is set to SFF_INUSE | SFF_MINOR. If the channel does not correspond to a device file, srf_flags is set simply to SFF_INUSE.

When a request comes in for a read, write, or ioctl operation and the network service is not already processing another request for the same operation, srf_flags is set to SFF_READ_IP, SFF_WRITE_IP, or SFF_IOCTL_IP. However, if an operation is attempted but the underlying protocol is still processing a previous request of the same nature (e.g., udp_write()), the appropriate flag (SFF_IOCTL_SUSP, SFF_READ_SUSP, or SFF_WRITE_SUSP) in srf_flags is set.


int srf_fd, srf_port:

srf_fd and srf_port are both set by sr_add_minor(). For the channels in srf_fd_table[] that correspond to the device files (e.g., /dev/udp0), srf_fd is set to the minor device number of the device. For example, if /dev/udp0 is added to sr_fd_table[] and the interface number of the device file is 0 (see comments for ip_conf[]), then the minor device number is:

if2minor(ifno, dev) = ((0)*16 + UDP_DEV = 0 + 4 = 4

For the channels in srf_fd_table[] that do not correspond to a device file, srf_fd is the file descriptor for the appropriate protocol. For example, if the file system requests that a udp channel be opened, srf_open is dereferenced and udp_open() is called. udp_open() opens a udp file descriptor and returns the index of the corresponding element in udp_fd_table[]. srf_fd is set to the index of this element.

Later, when the file system requests a read or a write on the open channel, srf_fd is passed into the protocol-specific read or write function (e.g., udp_read()), allowing the protocol-specific function to locate the appropriate file descriptor (e.g., udp file descriptor).

srf_port is more straight-forward. srf_port is the index in the protocol's port table. For example, if a system has two udp device files (/dev/udp0 and /dev/udp1), udp_port_table[] will have two entries, 0 and 1. Therefore, srf_port for the entry in sr_fd_table[] that corresponds to /dev/udp0 will be 0 and srf_port for the entry that corresponds to /dev/udp1 will be 1.


sr_open_t srf_open:
sr_close_t srf_close:
sr_write_t srf_write:
sr_read_t srf_read:
sr_ioctl_t srf_ioctl:
sr_cancel_t srf_cancel:


The fields above are all protocol-specific functions and and are all set by sr_add_minor(). For example, when sr_add_minor() is called by udp_init(), srf_open, srf_close, srf_write, srf_read, srf_ioctl, and srf_cancel are set to the pointers of the functions udp_open(), udp_close(), udp_write(), udp_read(), udp_ioctl(), and udp_cancel(). Later, when the file system makes a request to the network service, these functions will be called. For example, if the file system requests that data is written to a channel, srf_write is dereferenced and, if the channel is a udp channel, udp_write() is called.

mq_t *srf_ioctl_q, *srf_ioctl_q_tail:
mq_t *srf_read_q, *srf_read_q_tail:
mq_t *srf_write_q, *srf_write_q_tail:


The fields above are the heads and tails of linked lists of ioctl, read, and write messages waiting to be processed. When a message requesting an ioctl, read, or write operation is received while an earlier message of the same type is still being processed, the new message is placed at the end of the corresponding linked list. If no earlier message of that type is outstanding, the new message becomes the only element of the list and is processed immediately.


After the initialization of the network service, sr_rec() is called upon receipt of messages from the file system in the endless loop within main(). sr_rec() then calls a function to handle the specific request. For open requests, sr_rec() calls sr_open(); for read, write, and ioctl requests, sr_rec() calls sr_rwio(); for close requests, sr_rec() calls sr_close(); for cancel requests, sr_rec() calls sr_cancel().


0015228          {
Verify that the sr_fd_table[] element refers to a minor device.

ENXIO (defined in include/errno.h) stands for "no such device or address" (in our case, "no such device" is relevant).


0015229                   DBLOCK(1, printf("replying ENXIO\n"));
0015230                   return ENXIO;
0015231          }
0015232          for (i=0; i<FD_NR && (sr_fd_table[i].srf_flags & SFF_INUSE); i++);
Find the first sr_fd_table[] element that is not in use. i is the index of that element and is used in the following lines.


0015233 
0015234          if (i>=FD_NR)
0015235          {
0015236                   DBLOCK(1, printf("replying ENFILE\n"));
ENFILE (defined in include/errno.h) indicates that there are "too many open files in system".


0015237                   return ENFILE;
0015238          }
0015239 
0015240          sr_fd= &sr_fd_table[i];
0015241          *sr_fd= sr_fd_table[minor];
The sr_fd struct with the minor device number minor (for example, sr_fd_table[4] for /dev/udp0) is copied to the open slot (i.e., sr_fd_table[i]).


0015242          sr_fd->srf_flags= SFF_INUSE;
The element is marked as in use. Note that the element is not marked as a minor device (SFF_MINOR). Also note that srf_port is the same as that of the minor device just copied. For example, if a process opens /dev/udp1, srf_port will be 1 (see comment for line 15002).

When the device is closed, srf_flags is set to SFF_FREE (see line 15267).


0015243          fd= (*sr_fd->srf_open)(sr_fd->srf_port, i, sr_get_userdata,
0015244                   sr_put_userdata, 0);
Call the protocol-specific open function. For example, the open function for the udp protocol is udp_open() and for tcp is tcp_open(). These functions return a protocol-specific file descriptor which, like an sr file descriptor (sr_fd), is simply the index of an element in a table (e.g., udp_fd_table[]). srf_fd for the new channel is set to this value (see line 15251).



udp_open()


udp_open(port, srfd, get_userdata, put_userdata, put_pkt), called from sr_open(), finds the first unused element in udp_fd_table[], configures this udp file descriptor with the arguments passed in, and returns the index of the (newly configured) element within udp_fd_table[].

The parameters are explained below:

When a process opens one of the udp devices (e.g., /dev/udp0), sr_open() claims an unused element in sr_fd_table[] and copies the element of sr_fd_table[] that corresponds to the device to this unused element. The index of this element is then passed in as the second argument (srfd). Later, when the process wishes to read, write, or perform an ioctl on this open file descriptor, it includes this sr file descriptor in the message that it sends to the file system.

The first argument (port) identifies the port. For the following example inet.conf file:

eth0 DP8390 0 { default; };
psip1;

there are 2 ports. Port 0 corresponds to the ethernet device and port 1 corresponds to the psip port.

get_userdata is a pointer to the function that gets the data from the process. This function will always be sr_get_userdata().

put_userdata is a pointer to the function that copies data into the user process. This function will always be sr_put_userdata().

The put_pkt parameter is not used in this function.


0015245          if (fd<0)
0015246          {
0015247                   sr_fd->srf_flags= SFF_FREE;
0015248                   DBLOCK(1, printf("replying %d\n", fd));
0015249                   return fd;
0015250          }
0015251          sr_fd->srf_fd= fd;
This is very important. As described in the sr_fd general comment, this allows the code to find the corresponding entry in the protocol-specific file descriptor table (e.g., udp_fd_table[]).


0015252          return i;
Return the index of the newly opened channel. This value is ultimately sent back to the calling process (see line 15179).
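
The steps of sr_open() annotated above can be condensed into one small model. This is a sketch under stated assumptions: the flag and error values are placeholders, demo_proto_open() is a hypothetical stand-in for srf_open, and minor-device elements are assumed to carry both the in-use and the minor flag.

```c
#define DEMO_FD_NR   64
#define DEMO_FREE    0x00
#define DEMO_INUSE   0x01
#define DEMO_MINOR   0x02
#define DEMO_ENXIO   (-6)    /* placeholder error values */
#define DEMO_ENFILE  (-23)

typedef struct demo_sr_fd {
    int srf_flags;
    int srf_fd;
    int srf_port;
} demo_sr_fd_t;

static demo_sr_fd_t demo_table[DEMO_FD_NR];

/* Hypothetical protocol open: returns a protocol-level descriptor. */
static int demo_proto_open(int port, int srfd)
{
    (void)srfd;
    return port + 100;
}

static int demo_open(int minor)
{
    int i, fd;

    /* The requested element must describe a minor device. */
    if (minor < 0 || minor >= DEMO_FD_NR ||
        !(demo_table[minor].srf_flags & DEMO_MINOR))
        return DEMO_ENXIO;

    /* Find the first element that is not in use. */
    for (i = 0; i < DEMO_FD_NR && (demo_table[i].srf_flags & DEMO_INUSE); i++)
        ;
    if (i >= DEMO_FD_NR)
        return DEMO_ENFILE;

    /* Copy the device element, then mark the copy as a plain channel. */
    demo_table[i] = demo_table[minor];
    demo_table[i].srf_flags = DEMO_INUSE;

    fd = demo_proto_open(demo_table[i].srf_port, i);
    if (fd < 0) {
        demo_table[i].srf_flags = DEMO_FREE;
        return fd;
    }
    demo_table[i].srf_fd = fd;   /* remember the protocol-level descriptor */
    return i;                    /* the channel index goes back to the caller */
}
```

The device element itself is never modified, which is why a channel can be opened and closed without disturbing the configuration made by sr_add_minor().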


0015253 }
0015254 
0015255 PRIVATE void sr_close(m)
0015256 message *m;
sr_close()

sr_close() closes a channel by calling the channel's protocol-specific close function (e.g., udp_close()) and marking the channel's corresponding element in sr_fd_table[] as free.


0015257 {
0015258          sr_fd_t *sr_fd;
0015259 
0015260          sr_fd= sr_getchannel(m->DEVICE);
sr_getchannel()

For a given channel (i.e., an index within sr_fd_table[]), sr_getchannel(minor) returns a pointer to the corresponding sr_fd element.

Note that minor, sr_getchannel()'s only parameter, does not correspond to a minor device. Instead, minor corresponds to a normal slot within sr_fd_table[].


0015261          assert (sr_fd);
0015262 
0015263          assert (!(sr_fd->srf_flags & SFF_BUSY));
0015264 
0015265          assert (!(sr_fd->srf_flags & SFF_MINOR));
0015266          (*sr_fd->srf_close)(sr_fd->srf_fd);
Call the protocol-specific close function for the given channel. For example, if the channel is a udp channel, udp_close() is called. Note that srf_close was set by sr_add_minor().

Mark the slot as free so that another channel can later claim the slot.


0015267          sr_fd->srf_flags= SFF_FREE;
0015268 }
0015269 
0015270 PRIVATE int sr_rwio(m)
0015271 mq_t *m;
sr_rwio()

sr_rwio() is either called by sr_rec() upon receipt of a read, write, or ioctl request or by process_req_q() to handle queued messages. The handling of a read, write, or ioctl request is a complex operation that is best explained by an example. We will analyze two different read requests, one with data ready to be delivered to user processes and one with no data ready to be delivered.

SCENARIO 1: There is data ready to be delivered to the user process and there are no messages in the read queue (i.e., srf_read_q, srf_read_q_tail == null). The file descriptor was opened by the udp client.

When a message arrives requesting a read operation (i.e., m->mq_mess.m_type == DEV_READ), udp_read() is called, which in turn calls udp_packet2user(). udp_packet2user() calls sr_put_userdata() twice. The first time sr_put_userdata() is called, the data is transferred to the user process. The second time sr_put_userdata() is called, a reply is sent to the user process by sr_reply(), which ultimately calls mq_free(), freeing the message.

SCENARIO 2: There are no read messages waiting in the read queue of a file descriptor opened by the udp client. However, this time, there is no data waiting to be read.

A DEV_READ message (message 1) arrives and sr_rec() calls sr_rwio() to handle the message. Since there are no messages in the read queue, SFF_READ_IP is not yet raised. sr_rwio() therefore makes message 1 the head of the read queue, sets the SFF_READ_IP flag, and calls udp_read(). However, since no data is waiting to be read, udp_read() returns NW_SUSPEND. sr_rwio() then sets the SFF_READ_SUSP flag and returns NW_SUSPEND to sr_rec(), which then sends a REVIVE message back to the file system (in other words, it instructs the FS to revive the process requesting the read). After this, the sr file descriptor is shown:



A second DEV_READ message (message 2) arrives and sr_rec() again calls sr_rwio(). Because the SFF_READ_IP flag is already raised by the first message, the second message is appended to the read queue and sr_rwio() returns NW_SUSPEND. Again, sr_reply() sends a REVIVE message to the file system.



A third DEV_READ message (message 3) arrives and, as before, is placed at the end of the read queue and a REVIVE message is sent to the file system.



Finally, some data arrives (possibly from the ethernet task). udp_packet2user() is ultimately called and it, in turn, calls sr_put_userdata() twice: the first time to deliver the packet to the buffer of the user process and the second time to return the number of bytes delivered to the user process. The second time sr_put_userdata() is called, it calls sr_reply(), which sets the read queue head to null.



After sr_reply() returns, sr_put_userdata() then calls process_req_q(), passing in mq as the first parameter. process_req_q() goes through the remaining messages, starting with mq (see figure above), handling the read requests (by calling sr_rwio()) if possible. If there is enough data to satisfy all of the read requests (i.e., the requests of messages 2 and 3), the read queue is as appears in the figure on the left. If, for example, there is not even enough data to satisfy the request of message 2, the message queue is as appears in the figure on the right.



If there was enough data to satisfy the requests of messages 2 and 3, the remaining messages are freed by sr_reply(), just as message 1 was previously freed. Note that the srf_read_q field in the right-hand figure above was set by sr_rwio() before returning.
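
The queueing half of Scenario 2 can be modelled in a few lines. This is a minimal sketch under stated assumptions: the flag values and demo_ names are hypothetical, only the read queue is modelled, and the completion path (sr_put_userdata(), sr_reply(), process_req_q()) is left out.

```c
#include <stddef.h>

#define DEMO_READ_IP    0x1   /* a read is in progress */
#define DEMO_READ_SUSP  0x2   /* the in-progress read is suspended */
#define DEMO_OK         0
#define DEMO_SUSPEND    (-998)

typedef struct demo_mq {
    int id;
    struct demo_mq *mq_next;
} demo_mq_t;

typedef struct demo_chan {
    int flags;
    demo_mq_t *read_q, *read_q_tail;
    int data_ready;           /* nonzero when the protocol has data */
} demo_chan_t;

static int demo_rwio(demo_chan_t *c, demo_mq_t *m)
{
    int r;

    m->mq_next = NULL;
    if (c->flags & DEMO_READ_IP) {
        /* A read is already being serviced: append and suspend. */
        c->read_q_tail->mq_next = m;
        c->read_q_tail = m;
        return DEMO_SUSPEND;
    }

    /* Queue was empty: this message becomes head and tail. */
    c->read_q = c->read_q_tail = m;
    c->flags |= DEMO_READ_IP;

    /* Stand-in for the protocol read (e.g., udp_read()). */
    r = c->data_ready ? DEMO_OK : DEMO_SUSPEND;
    if (r == DEMO_SUSPEND)
        c->flags |= DEMO_READ_SUSP;
    return r;
}
```

Feeding three read messages into an idle channel with no data leaves message 1 at the head, message 3 at the tail, and both flags set, which is the state described after message 3 arrives.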


0015272 {
0015273          sr_fd_t *sr_fd;
0015274          mq_t **q_head_ptr, **q_tail_ptr;
0015275          int ip_flag, susp_flag;
0015276          int r;
0015277          ioreq_t request;
request is only used for ioctl requests (see line 15334).

ioreq_t is declared in net/inet.h as:

typedef int ioreq_t;


0015278          size_t size;
0015279 
0015280          sr_fd= sr_getchannel(m->mq_mess.DEVICE);
0015281          assert (sr_fd);
0015282 
For each message type, use the appropriate message queue and the appropriate flags. This queue and these flags will be used in the block that begins on line 15309.


0015283          switch(m->mq_mess.m_type)
0015284          {
0015285          case DEV_READ:
0015286                   q_head_ptr= &sr_fd->srf_read_q;
0015287                   q_tail_ptr= &sr_fd->srf_read_q_tail;
0015288                   ip_flag= SFF_READ_IP;
0015289                   susp_flag= SFF_READ_SUSP;
0015290                   break;
0015291          case DEV_WRITE:
0015292                   q_head_ptr= &sr_fd->srf_write_q;
0015293                   q_tail_ptr= &sr_fd->srf_write_q_tail;
0015294                   ip_flag= SFF_WRITE_IP;
0015295                   susp_flag= SFF_WRITE_SUSP;
0015296                   break;
0015297          case DEV_IOCTL:
0015298                   q_head_ptr= &sr_fd->srf_ioctl_q;
0015299                   q_tail_ptr= &sr_fd->srf_ioctl_q_tail;
0015300                   ip_flag= SFF_IOCTL_IP;
0015301                   susp_flag= SFF_IOCTL_SUSP;
0015302                   break;
0015303 #if !CRAMPED
0015304          default:
0015305                   ip_panic(("illegal case entry"));
0015306 #endif
0015307          }
0015308 
0015309          if (sr_fd->srf_flags & ip_flag)
If the channel is already servicing a similar request, put the message at the tail of the appropriate message queue and return SUSPEND.


0015310          {
0015311                   assert(sr_fd->srf_flags & susp_flag);
0015312                   assert(*q_head_ptr);
0015313 
0015314                   (*q_tail_ptr)->mq_next= m;
0015315                   *q_tail_ptr= m;
0015316                   return SUSPEND;
0015317          }
0015318          assert(!*q_head_ptr);
0015319 
0015320          *q_tail_ptr= *q_head_ptr= m;
If this point in the code is reached, the relevant message queue was empty. Therefore, the head and the tail of the relevant message queue are both set to the message just received.

For a read request, set the SFF_READ_IP bit in srf_flags. For a write request, set the SFF_WRITE_IP bit. For an ioctl request, set the SFF_IOCTL_IP bit (see lines 15288, 15294, and 15300).


0015321          sr_fd->srf_flags |= ip_flag;
0015322 
0015323          switch(m->mq_mess.m_type)
Call the appropriate function. For example, for a udp channel, call udp_read(), udp_write(), or udp_ioctl().

sr_fd->srf_fd is the corresponding slot in the protocol-specific file descriptor table (e.g., udp_fd_table[]).

m->mq_mess.COUNT is the number of bytes requested for the read or write operation.

It is very important to understand that if udp_read(), udp_write(), and udp_ioctl() are able to satisfy their requests, they will call either sr_put_userdata() (for udp_read()) or sr_get_userdata() (for udp_write() and udp_ioctl()). Understanding the role that sr_put_userdata() and sr_get_userdata() play is probably the largest challenge in understanding the sr.c file.


0015324          {
0015325          case DEV_READ:
0015326                   r= (*sr_fd->srf_read)(sr_fd->srf_fd,
0015327                            m->mq_mess.COUNT);
0015328                   break;
0015329          case DEV_WRITE:
0015330                   r= (*sr_fd->srf_write)(sr_fd->srf_fd,
0015331                            m->mq_mess.COUNT);
0015332                   break;
0015333          case DEV_IOCTL:
0015334                   request= m->mq_mess.REQUEST;
0015335 #ifdef _IOCPARM_MASK
_IOCPARM_MASK is #define'd in include/sys/ioctl.h as:

#define _IOCPARM_MASK 0x1FFF

/* Ioctls have the command encoded in the low-order word, and the size
* of the parameter in the high-order word. The 3 high bits of the high-
* order word are used to encode the in/out/void status of the parameter.
*/

size is used to determine if request contains a valid ioctl request.


An example of a valid request is NWIOSUDPOPT, which is found in include/sys/ioctl.h:

#define NWIOSUDPOPT _IOW('n', 64, struct nwio_udpopt)

#define _IOW(x,y,t) ((x << 8) | y | ((sizeof(t) & _IOCPARM_MASK) << 16) |\
_IOC_IN)

#define _IOCPARM_MASK 0x1FFF
#define _IOC_IN 0x40000000

Therefore,

NWIOSUDPOPT = _IOW('n', 64, struct nwio_udpopt)
= (0x6E << 8) | 0x40 | ((sizeof(struct nwio_udpopt) & 0x1FFF) << 16) | 0x40000000
= 0x6E00 | 0x40 | ((20 & 0x1FFF) << 16) | 0x40000000
= 0x40146E40


0015336                   size= (request >> 16) & _IOCPARM_MASK;
In the example above:

size = (0x40146E40 >> 16) & 0x1FFF = 0x14 = 20

Note that 20 = sizeof(nwio_udpopt)


0015337                   if (size>MAX_IOCTL_S)
MAX_IOCTL_S is declared in inet/generic/sr.h as:

#define MAX_IOCTL_S 512

This ensures that the struct passed from the user process to the network service is not greater than 512 bytes. For the NWIOSUDPOPT request, size is 20 bytes and therefore acceptable.
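
The encoding and the size check above can be reproduced with a small sketch. The DEMO_ names are hypothetical; unlike the real _IOW(), this macro takes the parameter size directly instead of a type, so that the 20-byte size from the example can be plugged in without the real struct definition.

```c
#define DEMO_IOCPARM_MASK  0x1FFFUL
#define DEMO_IOC_IN        0x40000000UL
#define DEMO_MAX_IOCTL_S   512UL

/* Same bit layout as _IOW(), but with an explicit size argument. */
#define DEMO_IOW(x, y, size) \
    (((unsigned long)(x) << 8) | (unsigned long)(y) | \
     (((unsigned long)(size) & DEMO_IOCPARM_MASK) << 16) | DEMO_IOC_IN)

/* Recover the parameter size from an encoded request (line 15336). */
static unsigned long demo_ioctl_size(unsigned long request)
{
    return (request >> 16) & DEMO_IOCPARM_MASK;
}

/* Mirror of the check on line 15337: reject oversized parameters. */
static int demo_ioctl_size_ok(unsigned long request)
{
    return demo_ioctl_size(request) <= DEMO_MAX_IOCTL_S;
}
```

DEMO_IOW('n', 64, 20) evaluates to 0x40146E40, and demo_ioctl_size() recovers 20 from it. A request encoded with a 600-byte parameter would be rejected by demo_ioctl_size_ok().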


0015338                   {
0015339                            DBLOCK(1, printf("replying EINVAL\n"));
0015340                            r= sr_put_userdata(sr_fd-sr_fd_table, EINVAL,
0015341                                     NULL, 1);
Send an error reply (EINVAL, "invalid argument") to the user process.

sr_fd-sr_fd_table is the index of the channel.


0015342                            assert(r == OK);
0015343                            return OK;
0015344                   }
0015345 #endif
0015346                   r= (*sr_fd->srf_ioctl)(sr_fd->srf_fd, request);
Call the channel's protocol-specific ioctl function (e.g., udp_ioctl()). sr_fd->srf_fd is the slot in the underlying protocol's file descriptor table (e.g., udp_fd_table[]).


0015347                   break;
0015348 #if !CRAMPED
0015349          default:
0015350                   ip_panic(("illegal case entry"));
0015351 #endif
0015352          }
0015353 
0015354          assert(r == OK || r == SUSPEND ||
0015355                   (printf("r= %d\n", r), 0));
0015356          if (r == SUSPEND)
0015357                   sr_fd->srf_flags |= susp_flag;
If the result of the operation is SUSPEND, then set the specific flag (e.g., SFF_READ_SUSP) and return SUSPEND. SUSPEND is #define'd in include/minix/com.h:

# define SUSPEND -998 /* used in interrupts when tty has no data */

An example of a suspended operation is a read operation that is requested when there is no data waiting.

The suspend flag (as well as the ip flag) is ultimately cleared on line 15591 when data becomes available in the user process (for a write or ioctl operation) or from the ethernet buffer (for a read or ioctl operation).


0015358          return r;
0015359 }
0015360 
0015361 PRIVATE int sr_cancel(m)
0015362 message *m;
sr_cancel()

Unlike sr_open(), sr_close(), and sr_rwio(), sr_cancel() does not call the protocol-specific cancel function directly. Instead, sr_cancel() calls walk_queue() for each of the channel's queues (ioctl, read, and write) in turn; walk_queue() walks the given queue until it finds a message with the same process number.

If the message is the first message in the queue, walk_queue() calls the protocol-specific cancel function (see line 15422). If a message is found but it is not the first message in the queue, walk_queue() removes the message from the queue and returns EINTR. If a message is not found, walk_queue() returns EAGAIN. If all three queues return EAGAIN, the request could not be found, which should never happen; sr_cancel() panics in that case.


0015363 {
0015364          sr_fd_t *sr_fd;
0015365          int i, result;
0015366          mq_t *q_ptr, *q_ptr_prv;
0015367          int proc_nr, ref, operation;
0015368 
0015369  result=EINTR;
EINTR error number is #define'd in include/errno.h as:

#define EINTR (_SIGN 4) /* interrupted function call */

It is unclear why result is set here. It will be reset later in the function.



0015370          proc_nr= m->PROC_NR;
proc_nr is the process number of the sender.


0015371          ref= 0;
0015372          operation= 0;
0015373          sr_fd= sr_getchannel(m->DEVICE);
0015374          assert (sr_fd);
0015375 
Walk the ioctl, read, and write queues (in that order) until the message is found.


0015376          {
0015377                   result= walk_queue(sr_fd, sr_fd->srf_ioctl_q,
0015378                            &sr_fd->srf_ioctl_q_tail, SR_CANCEL_IOCTL,
0015379                            proc_nr, ref);
walk_queue()

For a given channel (i.e., for a given sr_fd) in sr_fd_table[], walk_queue(sr_fd, q_head, q_tail_ptr, type, proc_nr, ref) finds a message in the appropriate queue (srf_read_q, srf_write_q or srf_ioctl_q) for the given process number (proc_nr). If a message with this process number is found and the message is the first in the queue, call the protocol specific cancel function (e.g., udp_cancel()) and return OK. If a message with this process number is found and it is not the first in the queue, simply remove the message from the queue and return EINTR. If a message with the given process number can't be found in the queue, return EAGAIN. The difference between the first message and any other message in the queue is that the first message is already being processed and therefore some extra cleanup must be done.

ref, the last parameter of walk_queue(), is never used.


0015380                   if (result != EAGAIN)
0015381                            return result;
0015382          }
0015383          {
0015384                   result= walk_queue(sr_fd, sr_fd->srf_read_q,
0015385                            &sr_fd->srf_read_q_tail, SR_CANCEL_READ,
0015386                            proc_nr, ref);
0015387                   if (result != EAGAIN)
0015388                            return result;
0015389          }
0015390          {
0015391                   result= walk_queue(sr_fd, sr_fd->srf_write_q,
0015392                            &sr_fd->srf_write_q_tail, SR_CANCEL_WRITE,
0015393                            proc_nr, ref);
0015394                   if (result != EAGAIN)
0015395                            return result;
0015396          }
A message should have been found in one of the queues.


0015397 #if !CRAMPED
0015398          ip_panic((
0015399 "request not found: from %d, type %d, MINOR= %d, PROC= %d, REF= %d OPERATION= %d",
0015400                   m->m_source, m->m_type, m->DEVICE,
0015401                   m->PROC_NR, 0, 0));
0015402 #endif
0015403 }
0015404 
0015405 PRIVATE int walk_queue(sr_fd, q_head, q_tail_ptr, type, proc_nr, ref)
0015406 sr_fd_t *sr_fd;
0015407 mq_t *q_head, **q_tail_ptr;
0015408 int type;
0015409 int proc_nr;
0015410 int ref;
walk_queue()

For a given channel (i.e., for a given sr_fd) in sr_fd_table[], walk_queue(sr_fd, q_head, q_tail_ptr, type, proc_nr, ref) finds a message in the appropriate queue (srf_read_q, srf_write_q or srf_ioctl_q) for the given process number (proc_nr). If a message with this process number is found and the message is the first in the queue, call the protocol specific cancel function (e.g., udp_cancel()) and return OK. If a message with this process number is found and it is not the first in the queue, simply remove the message from the queue and return EINTR. If a message with the given process number can't be found in the queue, return EAGAIN. The difference between the first message and any other message in the queue is that the first message is already being processed and therefore some extra cleanup must be done.

ref, the last parameter of walk_queue(), is never used.


0015411 {
0015412          mq_t *q_ptr_prv, *q_ptr;
0015413          int result;
0015414 
Go through each of the messages in the appropriate queue.


0015415          for(q_ptr_prv= NULL, q_ptr= q_head; q_ptr;
0015416                   q_ptr_prv= q_ptr, q_ptr= q_ptr->mq_next)
0015417          {
0015418                   if (q_ptr->mq_mess.PROC_NR != proc_nr)
0015419                            continue;
Check whether the process number of the message in the queue is the same as the process number of the message being cancelled.


0015420                   if (!q_ptr_prv)
The message is the first message in the queue (the message being processed - see line 15320).


0015421                   {
0015422                            result= (*sr_fd->srf_cancel)(sr_fd->srf_fd, type);
Call the protocol specific cancel function (e.g., udp_cancel()) and return OK.


0015423                            assert(result == OK);
0015424                            return OK;
0015425                   }
0015426                   q_ptr_prv->mq_next= q_ptr->mq_next;
The message is not the first message in the queue.

There is a distinction between an operation that has been started and has been interrupted and an operation that is still on the message queue and has been interrupted. In the first case, the call to the protocol-specific cancel function causes a reply to be sent to the file system. In the second case, sr_rec() (line 15122) must send the reply.

Remove the message from the queue. If the message is the last message in the queue, set the tail of the queue to point to the previous message.


0015427                   mq_free(q_ptr);
0015428                   if (!q_ptr_prv->mq_next)
0015429                            *q_tail_ptr= q_ptr_prv;
0015430                   return EINTR;
A return value of OK (line 15424) means that sr_rec() (line 15211) doesn't have to do anything (srf_cancel did the work). EINTR means that sr_rec() must send a result (EINTR) to the file system.


0015431          }
0015432          return EAGAIN;
The message was not found in the given queue.
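
The three outcomes of walk_queue() can be reproduced with a self-contained model of the list walk. This is a sketch under stated assumptions: the error values are placeholders, the demo_ names are hypothetical, a counter stands in for the call through srf_cancel, and the removed message is not returned to a free list.

```c
#include <stddef.h>

#define DEMO_OK      0
#define DEMO_EINTR   (-4)    /* placeholder error values */
#define DEMO_EAGAIN  (-11)

typedef struct demo_mq {
    int proc_nr;
    struct demo_mq *mq_next;
} demo_mq_t;

static int demo_cancel_calls;   /* counts stand-in cancel invocations */

static int demo_walk_queue(demo_mq_t *q_head, demo_mq_t **q_tail_ptr,
                           int proc_nr)
{
    demo_mq_t *prev, *cur;

    for (prev = NULL, cur = q_head; cur; prev = cur, cur = cur->mq_next) {
        if (cur->proc_nr != proc_nr)
            continue;
        if (!prev) {
            /* Head of queue: the request is in progress, so the
             * protocol's cancel function must clean up. */
            demo_cancel_calls++;
            return DEMO_OK;
        }
        /* Not the head: unlink it and repair the tail if needed. */
        prev->mq_next = cur->mq_next;
        if (!prev->mq_next)
            *q_tail_ptr = prev;
        return DEMO_EINTR;
    }
    return DEMO_EAGAIN;
}
```

Note that the head is passed by value while the tail is passed by pointer: removing a non-head message can change the tail but never the head.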


0015433 }
0015434 
0015435 PRIVATE sr_fd_t *sr_getchannel(minor)
sr_getchannel()

For a given channel (i.e., an index within sr_fd_table[]), sr_getchannel(minor) returns a pointer to the corresponding sr_fd element.

Note that minor, sr_getchannel()'s only parameter, does not correspond to a minor device. Instead, minor corresponds to a normal slot within sr_fd_table[].


0015436 int minor;
0015437 {
0015438          sr_fd_t *loc_fd;
0015439 
Verify that minor is an acceptable value (0 <= minor <= 63). If the slot number is valid, return a pointer to the slot (i.e., the corresponding sr_fd struct).


0015440          compare(minor, >=, 0);
compare()

compare is #define'd in inet/generic/assert.h:

#define compare(a,t,b) (!((a) t (b)) ? bad_compare(this_file, __LINE__, \
(a), #a " " #t " " #b, (b)) : (void) 0)

and bad_compare() is defined in inet/inet.c.

If the relationship between the 3 arguments in compare() does not hold, some debugging output is emitted and then Minix is terminated.

For example, if compare(result, >=, 0) is called and result (the first argument) is -1, Minix will be terminated.
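
The stringification trick in compare() can be tried outside of Minix. This is a minimal sketch: demo_bad_compare() is a hypothetical replacement for bad_compare() that prints and counts the failure instead of terminating, and __FILE__ replaces this_file.

```c
#include <stdio.h>

static int demo_failures;   /* number of failed comparisons so far */

/* Hypothetical stand-in for bad_compare(): report, but keep running. */
static void demo_bad_compare(const char *file, int line, int lhs,
                             const char *what, int rhs)
{
    demo_failures++;
    fprintf(stderr, "%s:%d: compare failed: %s (%d, %d)\n",
            file, line, what, lhs, rhs);
}

/* #a, #t and #b turn the operands and the operator into one string,
 * so the failure message shows the expression as it was written. */
#define demo_compare(a, t, b) \
    (!((a) t (b)) ? demo_bad_compare(__FILE__, __LINE__, \
        (a), #a " " #t " " #b, (b)) : (void) 0)
```

Passing the operator as a macro argument is what allows a single macro to cover >=, <, == and so on.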


0015441          compare(minor, <, FD_NR);
0015442 
0015443          loc_fd= &sr_fd_table[minor];
0015444 
0015445          assert (!(loc_fd->srf_flags & SFF_MINOR) &&
0015446                   (loc_fd->srf_flags & SFF_INUSE));
0015447 
0015448          return loc_fd;
0015449 }
0015450 
0015451 PRIVATE void sr_reply (mq, status, can_enqueue)
0015452 mq_t *mq;
0015453 int status;
0015454 int can_enqueue;
sr_reply()

sr_reply(mq, status, can_enqueue) attempts to send a REVIVE message back to the file system (FS) (which is what the FS expects to receive). sr_reply() is called from sr_rec() and from sr_get_userdata() and sr_put_userdata().

If called from sr_rec(), there are two scenarios. If the message mq (sr_reply()'s first parameter) was a request for a DEV_OPEN, DEV_CLOSE, or CANCEL, a reply is guaranteed to succeed and therefore there is no need to enqueue the message in repl_queue. If the message mq was a request for a DEV_READ, DEV_WRITE, or DEV_IOCTL operation that was suspended, a reply is also guaranteed to succeed. In both of these scenarios, it is not necessary to queue the message in repl_queue and, therefore, can_enqueue is FALSE if called from sr_rec().

However, if sr_reply() is called from sr_get_userdata() or sr_put_userdata(), the message may fail due to deadlock. Also, the message must eventually reach the file system so that the user process may be unblocked. Therefore, the reply message must be placed in repl_queue if the message doesn't succeed due to deadlock. For this reason, can_enqueue is TRUE if called from sr_get_userdata() or sr_put_userdata().

It is important to understand why deadlocks can occur. During the initialization of the network service, the network service calls svrctl() (which calls do_svrctl()) to initialize itself with the FS. do_svrctl() sets the fields of the network service's dmap[] entry to gen_io() for read, write, and ioctl requests and clone_opcl() for open and close requests. For this reason, whenever a read, write, or ioctl request for a network file is received by the file system, gen_io() is called and whenever an open or close request is received by the file system, clone_opcl() is called (note that clone_opcl() calls gen_io()).

If gen_io() calls sendrec() to send a message to the network service and there is a deadlock because the network service also attempted to send a message to the file system (i.e., sendrec() returned ELOCKED), gen_io() will call receive() instead.

If the read, write, or ioctl request could not be immediately satisfied, the network service will later send() another REVIVE message after the read, write, or ioctl operation has completed (successfully or unsuccessfully). However, the file system will not necessarily handle any deadlock that occurs (since the main loop in the file system may possibly handle the message) and so, if a deadlock occurs, the network service must later attempt to send the REVIVE message that was not successful due to the deadlock. The network service does this by placing the message in the repl_queue. The messages in repl_queue will be sent again the next time that sr_repl_queue() is called.


0015455 {
0015456          int result, proc, ref,operation;
0015457          message reply, *mp;
0015458 
0015459          proc= mq->mq_mess.PROC_NR;
PROC_NR is the process number that requested the operation from the file system (FS).


0015460          ref= 0;
0015461          operation= mq->mq_mess.m_type;
operation will be one of the following: DEV_OPEN, DEV_CLOSE, DEV_READ, DEV_WRITE, DEV_IOCTL, or CANCEL.


0015462 
0015463          if (can_enqueue)
If the message fails due to deadlock but should be retried later (i.e., should be placed in repl_queue), can_enqueue is TRUE. For an explanation of when can_enqueue is TRUE and when can_enqueue is FALSE, see the comment for line 15454.


0015464                   mp= &mq->mq_mess;
0015465          else
0015466                   mp= &reply;
0015467 
0015468          mp->m_type= REVIVE;
A message of type REVIVE 'revives' a previously blocked process. Other than cancel messages, REVIVE is the only type of message that the file system expects from the network service.

REVIVE is #define'd in include/minix/callnr.h:

#define REVIVE 67 /* to FS: revive a sleeping process */



0015469          mp->REP_PROC_NR= proc;
0015470          mp->REP_STATUS= status;
status (the second parameter to sr_reply()) can have many different values. For example, if a read/write/ioctl operation was suspended, status will equal SUSPEND (see line 15156). If an open operation was requested and the operation was successful, status will be the resulting file descriptor (for example, a udp file descriptor).


0015471          result= send(mq->mq_mess.m_source, mp);
Send the REVIVE message to the file system.



0015472          if (result == ELOCKED && can_enqueue)
It is possible that the file system is also attempting to send the network server a message. This will result in a deadlock. For a detailed description of deadlocks and how they are handled, read the comment on line 15454.


0015473          {
0015474                   if (repl_queue)
Place the message at the tail of the queue (which may also be the head of the queue).


repl_queue / repl_queue_tail


repl_queue is the head of the linked list (repl_queue_tail is the tail) of REVIVE messages that were previously sent to the file system (FS) but were not delivered due to deadlock. Before any new messages are processed by the network service, a second (and last) attempt is made by sr_repl_queue() to send out the REVIVE messages in repl_queue.
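
The park-and-retry behavior described above can be modeled in a few lines of ordinary C. This is only a sketch under stated assumptions: struct msg, fake_send(), reply(), flush_replies(), and the fs_busy switch are hypothetical stand-ins for mq_t, send(), sr_reply(), the retry pass, and the deadlock condition, and the numeric value given to ELOCKED is a placeholder.

```c
#include <assert.h>
#include <stddef.h>

#define OK       0
#define ELOCKED  (-1)   /* placeholder value for this sketch only */

/* Hypothetical stand-in for mq_t: just the link field. */
struct msg {
    struct msg *next;
};

static struct msg *repl_head, *repl_tail;  /* models repl_queue / repl_queue_tail */
static int fs_busy;                        /* nonzero: the simulated send deadlocks */
static int delivered;                      /* replies that reached the simulated FS */

/* Simulated send(): fails with ELOCKED while the peer is itself sending. */
static int fake_send(struct msg *m)
{
    (void)m;
    if (fs_busy)
        return ELOCKED;
    delivered++;
    return OK;
}

/* Try to deliver a reply; on deadlock, park it at the tail of the retry queue. */
static void reply(struct msg *m)
{
    if (fake_send(m) == ELOCKED) {
        m->next = NULL;
        if (repl_head)
            repl_tail->next = m;
        else
            repl_head = m;
        repl_tail = m;
    }
}

/* Retry pass: resend the parked replies, oldest first.
 * Returns the number of replies that were delivered. */
static int flush_replies(void)
{
    int sent = 0;

    while (repl_head != NULL && fake_send(repl_head) == OK) {
        repl_head = repl_head->next;
        sent++;
    }
    if (repl_head == NULL)
        repl_tail = NULL;
    return sent;
}
```

Calling reply() twice while fs_busy is set leaves both messages parked in arrival order; clearing fs_busy and calling flush_replies() then delivers them in that same order.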

For a detailed description of the circumstances that lead to a deadlock, see the comment on line 15454.


0015475                            repl_queue_tail->mq_next= mq;
0015476                   else
0015477                            repl_queue= mq;
0015478                   repl_queue_tail= mq;
In addition to adding a message to repl_queue, the message's mq_next field should be set to NULL (since it is the tail of the queue). Future versions of the network service will fix this.
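
A minimal sketch of a tail append that does clear the link field is given here. The names struct node, struct queue, enqueue_tail(), and queue_length() are hypothetical and are not part of the network service; the point is only that a message which still carries an old mq_next value must not drag that stale link into the new queue.

```c
#include <assert.h>
#include <stddef.h>

struct node { struct node *next; int id; };   /* hypothetical stand-in for mq_t */
struct queue { struct node *head, *tail; };

/* Append n to q. n->next is cleared first, so the list stays
 * NULL-terminated even if n was previously linked into another list. */
static void enqueue_tail(struct queue *q, struct node *n)
{
    n->next = NULL;
    if (q->head)
        q->tail->next = n;
    else
        q->head = n;
    q->tail = n;
}

/* Count the nodes by walking from the head until NULL is reached. */
static int queue_length(const struct queue *q)
{
    const struct node *n;
    int len = 0;

    for (n = q->head; n != NULL; n = n->next)
        len++;
    return len;
}
```

Without the first assignment in enqueue_tail(), a node that arrives with a leftover next pointer would make the queue appear longer than it is.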


0015479                   return;
0015480          }
0015481          if (result != OK)
0015482                   ip_panic(("unable to send"));
0015483          if (can_enqueue)
As described above, if the message is no longer needed, free the message.


0015484                   mq_free(mq);
mq_free()

mq_free(mq) places mq, mq_free()'s only parameter, back on mq_freelist.





0015485 }
0015486 
0015487 PRIVATE acc_t *sr_get_userdata (fd, offset, count, for_ioctl)
0015488 int fd;
0015489 vir_bytes offset;
0015490 vir_bytes count;
0015491 int for_ioctl;
sr_get_userdata()

sr_get_userdata() is the counterpart to sr_put_userdata() and does one of two things:

1) Copies data from a user process to a buffer (to be more specific, a chain of accessors) within the network service (this process). This can be either ioctl data (in which case, for_ioctl is TRUE) or write data (for_ioctl is FALSE). For example, udp_setopt() (indirectly) calls sr_get_userdata() to get configuration data. Also, restart_write_fd() (indirectly) calls sr_get_userdata() before passing data on to the ip code.

2) Sends a REVIVE message to the file system (FS). In this case count is zero and the status to report is passed in offset. For example, if an illegal option is selected while configuring a udp file descriptor, reply_thr_get() is called, which then (indirectly) calls sr_get_userdata(), passing in EBADMODE for the parameter offset and zero for count. restart_write_fd() also (indirectly) calls sr_get_userdata() to send a REVIVE message back to the FS indicating the number of bytes copied from the user process.

sr_get_userdata() is often called twice in close succession: the first call attempts to copy the data from the user process, and the second sends a message to the FS indicating whether the copy operation was successful and, if it was, the number of bytes copied.

In my opinion, like sr_put_userdata(), this function should have been made into two functions. As it is, it is too confusing.
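
The two roles can be illustrated with a small user-space model. It is a sketch, not the network service's code: get_userdata(), copy_from_user(), and send_status() are hypothetical names, a plain memcpy() stands in for the copy from the user process, and a pair of variables stands in for the REVIVE message. What it preserves from the listing is the dispatch rule: a count of zero means "report offset as the status", anything else means "copy".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int last_status;   /* status carried by the most recent simulated reply */
static int reply_count;   /* number of simulated replies sent so far */

/* Role 2 on its own: report a status to the (simulated) file system. */
static void send_status(int status)
{
    last_status = status;
    reply_count++;
}

/* Role 1 on its own: copy count bytes starting at user + offset into dst. */
static void copy_from_user(const char *user, size_t offset, size_t count, char *dst)
{
    memcpy(dst, user + offset, count);
}

/* Combined entry point that mirrors the dual behavior:
 * count == 0 selects the reply role and offset is the status. */
static char *get_userdata(const char *user, size_t offset, size_t count, char *dst)
{
    if (count == 0) {
        send_status((int)offset);
        return NULL;
    }
    copy_from_user(user, offset, count, dst);
    return dst;
}
```

A typical pair of calls first copies the data and then reports how many bytes were handled, for example get_userdata(user, 2, 3, buf) followed by get_userdata(user, 3, 0, buf). Splitting the entry point into the two helpers, as the sketch does internally, is one way to remove the overloading of offset.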


0015492 {
0015493          sr_fd_t *loc_fd;
0015494          mq_t **head_ptr, **tail_ptr, *m, *tail, *mq;
0015495          int ip_flag, susp_flag;
0015496          int result;
0015497          int suspended;
0015498          char *src;
0015499          acc_t *acc;
0015500 
0015501          loc_fd= &sr_fd_table[fd];
Find the relevant sr_fd within sr_fd_table[].


0015502 
sr_get_userdata() is called for ioctl and write requests. The following if/else statement chooses the appropriate message queues and flags for the requested operation.

It is very important to note that the messages in the srf_ioctl_q and srf_write_q queues were placed there by sr_rwio() on lines 15309-15320.


0015503          if (for_ioctl)
0015504          {
0015505                   head_ptr= &loc_fd->srf_ioctl_q;
0015506                   tail_ptr= &loc_fd->srf_ioctl_q_tail;
0015507                   ip_flag= SFF_IOCTL_IP;
0015508                   susp_flag= SFF_IOCTL_SUSP;
0015509          }
0015510          else
0015511          {
0015512                   head_ptr= &loc_fd->srf_write_q;
0015513                   tail_ptr= &loc_fd->srf_write_q_tail;
0015514                   ip_flag= SFF_WRITE_IP;
0015515                   susp_flag= SFF_WRITE_SUSP;
0015516          }
0015517                   
0015518 assert (loc_fd->srf_flags & ip_flag);
0015519 
0015520          if (!count)
If count (the third parameter of sr_get_userdata()) is zero, sr_get_userdata() sends a message to the file system (by calling sr_reply()), using offset as the status, and then processes any requests still waiting in the queue (by calling process_req_q()).

The variable m is the first message in the queue and mq is the second message in the queue. Note that it is more than likely that the network service has serviced the request of message m. For example, if message m was a request for a write operation, a message was sent (by cp_u2b()) to SYSTASK requesting a data transfer from a user process. (Note that, although unlikely, the call to cp_u2b() may not have been successful.) Only a message to the FS indicating whether the operation was successful must now be sent. This is not true for the rest of the messages in the queue, which still must be processed.


0015521          {
0015522                   m= *head_ptr;
0015523                   *head_ptr= NULL;
0015524                   tail= *tail_ptr;
0015525 assert(m);
0015526                   mq= m->mq_next;
Note that mq is the second message in the queue. m is the first message and an attempt has already been made to process the message. sr_reply() will send a message indicating whether the attempt was successful.


0015527                   result= (int)offset;
0015528                   sr_reply (m, result, 1);
sr_reply()

sr_reply(mq, status, can_enqueue) attempts to send a REVIVE message back to the file system (FS) (which is what the FS expects to receive). sr_reply() is called from sr_rec() and from sr_get_userdata() and sr_put_userdata().

If called from sr_rec(), there are two scenarios. If the message mq (sr_reply()'s first parameter) was a request for a DEV_OPEN, DEV_CLOSE, or CANCEL, a reply is guaranteed to succeed and therefore there is no need to enqueue the message in repl_queue. If the message mq was a request for a DEV_READ, DEV_WRITE, or DEV_IOCTL operation that was suspended, a reply is also guaranteed to succeed. In both of these scenarios, it is not necessary to queue the message in repl_queue and, therefore, can_enqueue is FALSE if called from sr_rec().

However, if sr_reply() is called from sr_get_userdata() or sr_put_userdata(), the message may fail due to deadlock. Also, the message must eventually reach the file system so that the user process may be unblocked. Therefore, the reply message must be placed in repl_queue if the message doesn't succeed due to deadlock. For this reason, can_enqueue is TRUE if called from sr_get_userdata() or sr_put_userdata().

It is important to understand why deadlocks can occur. During the initialization of the network service, the network service calls svrctl() (which calls do_svrctl()) to initialize itself with the FS. do_svrctl() sets the fields of the network service's dmap[] entry to gen_io() for read, write, and ioctl requests and clone_opcl() for open and close requests. For this reason, whenever a read, write, or ioctl request for a network file is received by the file system, gen_io() is called and whenever an open or close request is received by the file system, clone_opcl() is called (note that clone_opcl() calls gen_io()).

If gen_io() calls sendrec() to send a message to the network service and there is a deadlock because the network service also attempted to send a message to the file system (i.e., sendrec() returned ELOCKED), gen_io() will call receive() instead.

If the read, write, or ioctl request could not be immediately satisfied, the network service will later send() another REVIVE message after the read, write, or ioctl operation has completed (successfully or unsuccessfully). However, the file system will not necessarily handle any deadlock that occurs (since the main loop in the file system may possibly handle the message) and so, if a deadlock occurs, the network service must later attempt to send the REVIVE message that was not successful due to the deadlock. The network service does this by placing the message in repl_queue. The messages in repl_queue will be sent again the next time that sr_repl_queue() is called.


0015529                   suspended= (loc_fd->srf_flags & susp_flag);
0015530                   loc_fd->srf_flags &= ~(ip_flag|susp_flag);
0015531                   if (suspended)
0015532                   {
0015533                            process_req_q(mq, tail, tail_ptr);
process_req_q()

process_req_q(mq, tail, tail_ptr) processes the remaining messages in the queue by calling sr_rwio() repeatedly until either there are no remaining messages or sr_rwio() returns SUSPEND. For a detailed description and a worked example, see the comment for line 15607.





0015534                   }
0015535                   else
0015536                   {
If the operation was not suspended, no other requests can be waiting in the queue (mq must be NULL), so there is nothing left to do.


0015537 assert(!mq);
0015538                   }
0015539                   return NULL;
0015540          }
0015541 
0015542          src= (*head_ptr)->mq_mess.ADDRESS + offset;
0015543          result= cp_u2b ((*head_ptr)->mq_mess.PROC_NR, src, &acc, count);
cp_u2b()

cp_u2b() copies size (the fourth parameter) bytes from an address in a user process (src, the second parameter) to a linked list of buffers within the network service. The address of the first buffer in the linked list is returned through the third parameter. The process number of the user process is the first parameter and is a field in the message requesting the copy.

cp_u2b() is called only by sr_get_userdata().

"u2b" stands for "user to buffer".


0015544 
0015545          return result<0 ? NULL : acc;
0015546 }
0015547 
0015548 PRIVATE int sr_put_userdata (fd, offset, data, for_ioctl)
0015549 int fd;
0015550 vir_bytes offset;
0015551 acc_t *data;
0015552 int for_ioctl;
sr_put_userdata()

sr_put_userdata(fd, offset, data, for_ioctl) is the counterpart to sr_get_userdata() and (like sr_get_userdata()) does one of two things:

1) Copies data from a buffer (to be more specific, a chain of accessors) within the network service (this process) to a buffer within the user process. This can be either ioctl data (in which case, for_ioctl is TRUE) or read data (for_ioctl is FALSE). For example, udp_ioctl() (indirectly) calls sr_put_userdata() to give configuration data to a user process. Also, udp_packet2user() (indirectly) calls sr_put_userdata() to pass data to the user process.

2) Sends a message to the FS. In this case data is NULL and the status to report is passed in offset. For example, if a read is attempted on a udp file descriptor before the file descriptor is configured, reply_thr_put() is called, which then (indirectly) calls sr_put_userdata(), passing in EBADMODE for the parameter offset and NULL for data.

In my opinion, like sr_get_userdata(), this should have been made into two functions. As it is, it is too confusing.


0015553 {
0015554          sr_fd_t *loc_fd;
0015555          mq_t **head_ptr, **tail_ptr, *m, *tail, *mq;
0015556          int ip_flag, susp_flag;
0015557          int result;
0015558          int suspended;
0015559          char *dst;
0015560 
0015561          loc_fd= &sr_fd_table[fd];
sr_fd_table[] is an array that contains the sr file descriptors. For a complete description of sr_fd_table[], see the discussion at the beginning of this file.


0015562 
0015563          if (for_ioctl)
sr_put_userdata() is called for two different requests: ioctl and read. The following if/else statement chooses the appropriate queues and flags for the requested operation.

The messages in the srf_ioctl_q and srf_read_q queues were placed there by sr_rwio() on lines 15309-15320.


0015564          {
0015565                   head_ptr= &loc_fd->srf_ioctl_q;
0015566                   tail_ptr= &loc_fd->srf_ioctl_q_tail;
0015567                   ip_flag= SFF_IOCTL_IP;
0015568                   susp_flag= SFF_IOCTL_SUSP;
0015569          }
0015570          else
0015571          {
0015572                   head_ptr= &loc_fd->srf_read_q;
0015573                   tail_ptr= &loc_fd->srf_read_q_tail;
0015574                   ip_flag= SFF_READ_IP;
0015575                   susp_flag= SFF_READ_SUSP;
0015576          }
0015577                   
0015578          assert (loc_fd->srf_flags & ip_flag);
0015579 
0015580          if (!data)
If data, sr_put_userdata()'s third parameter, is NULL, sr_put_userdata() sends a message to the file system (by calling sr_reply()), using offset as the status, and then processes any requests still waiting in the queue (by calling process_req_q()).

The variable m is the first message in the queue and mq is the second message in the queue. Note that it is more than likely that the network service has serviced the request of message m. For example, if message m was a request for a read operation, a message was sent (by cp_b2u()) to SYSTASK requesting a data transfer to a user process. (Note that, although unlikely, the call to cp_b2u() may not have been successful.) Only a message to the FS indicating whether the operation was successful must now be sent. This is not true for the rest of the messages in the queue, which still must be processed.


0015581          {
0015582                   m= *head_ptr;
0015583                   assert(m);
0015584 
0015585                   *head_ptr= NULL;
0015586                   tail= *tail_ptr;
0015587                   mq= m->mq_next;
Note that mq is the second message in the queue. m is the first message and will be handled by sr_reply() below.


0015588                   result= (int)offset;
0015589                   sr_reply (m, result, 1);
sr_reply()

sr_reply(mq, status, can_enqueue) attempts to send a REVIVE message back to the file system (FS). Here sr_reply() is called with can_enqueue set to TRUE: if the REVIVE message cannot be delivered due to deadlock, it is placed in repl_queue and sent again later. sr_reply() and the circumstances that lead to a deadlock are described in detail in the comment for line 15528.


0015590                   suspended= (loc_fd->srf_flags & susp_flag);
0015591                   loc_fd->srf_flags &= ~(ip_flag|susp_flag);
0015592                   if (suspended)
0015593                   {
0015594                            process_req_q(mq, tail, tail_ptr);
process_req_q()

process_req_q(mq, tail, tail_ptr) processes the remaining messages in the queue by calling sr_rwio() repeatedly until either there are no remaining messages or sr_rwio() returns SUSPEND. For a detailed description and a worked example, see the comment for line 15607.





0015595                   }
0015596                   else
0015597                   {
If the operation was not suspended, no other requests can be waiting in the queue (mq must be NULL), so there is nothing left to do.


0015598                            assert(!mq);
0015599                   }
0015600                   return OK;
0015601          }
0015602 
0015603          dst= (*head_ptr)->mq_mess.ADDRESS + offset;
0015604          return cp_b2u (data, (*head_ptr)->mq_mess.PROC_NR, dst);
cp_b2u()

cp_b2u(acc_ptr, proc, dest) copies all of the data in the accessor linked list acc_ptr, cp_b2u()'s first parameter, to a location dest, cp_b2u()'s third parameter, in the user process proc, cp_b2u()'s second parameter.

cp_b2u() is called only by sr_put_userdata().

"b2u" stands for "buffer to user".


0015605 }
0015606 
0015607 PRIVATE void process_req_q(mq, tail, tail_ptr)
0015608 mq_t *mq, *tail, **tail_ptr;
process_req_q()

If a message arrives but the message's request cannot be satisfied (for example, because there was no data available to be read), the message is put into a queue (e.g., srf_read_q). When the request can finally be processed, the first message in the queue is processed and then process_req_q() is called to process all of the remaining messages in the queue. process_req_q() does this by calling sr_rwio() repeatedly until either there are no remaining messages in the queue or sr_rwio() returns SUSPEND. (For example, sr_rwio() will return SUSPEND if there is not enough data to satisfy a read request.)

process_req_q() has three parameters:

mq - the first message in a linked list to be processed

tail - the address of the last message in the linked list

tail_ptr - the address of the relevant field of the file descriptor. For example, the relevant field for a read request will be the srf_read_q_tail field.

To see process_req_q() in action, consider the following example where messages requesting reads arrive yet there is no data waiting to be read:

Message 1 arrives, requesting a read operation. sr_rec() calls sr_rwio(), which checks whether the read flag was previously set. The read flag is not set, indicating that there has not been a previously unfulfilled read request. Therefore, the message is placed in the read queue (it becomes the head and the tail of the queue) and the protocol-specific read function is called (e.g., udp_read()). There is no data waiting to be read by the user process so the request is suspended (i.e., the suspend flag for the file descriptor is raised). Four further messages (messages 2-5) arrive and are placed at the end of the read queue. At this point, the queue is as follows:



Finally, data arrives. As a result, sr_put_userdata() is called twice. The first time sr_put_userdata() is called, the data requested by message 1 is copied to the user process. The second time sr_put_userdata() is called, sr_reply() is called, which sends a "REVIVE" message back to the FS specifying how many bytes were copied. Message 1 is then taken off the queue. At this point, the queue is as follows:



After sr_reply() returns, sr_put_userdata() calls process_req_q(), with a pointer to message 2 as the first argument, a pointer to message 5 as the second argument, and a pointer to the srf_read_q_tail field of the sr file descriptor as the third argument. process_req_q() repeatedly calls sr_rwio() until there are either no more messages or there is no more data available to satisfy the remaining requests. Let's assume that there is enough data to satisfy the first 2 messages. After satisfying the first two messages, the queue is as follows:
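
The walk-through above can be traced with a small, self-contained model. Everything in it is an assumption made for illustration: struct req stands in for mq_t, q_head and q_tail stand in for srf_read_q and srf_read_q_tail, fake_rwio() is a much simplified substitute for sr_rwio() that counts available data in whole requests, and the value of SUSPEND is a placeholder. Only the control flow of process_queue() follows the listing.

```c
#include <assert.h>
#include <stddef.h>

#define OK       0
#define SUSPEND  (-998)   /* placeholder value for this sketch only */

struct req { struct req *next; int id; };   /* hypothetical stand-in for mq_t */

static struct req *q_head, *q_tail;   /* models srf_read_q / srf_read_q_tail */
static int data_units;                /* how many more requests can be satisfied */

/* Simplified substitute for sr_rwio(): satisfy the request if data is
 * available; otherwise make it the only entry in the queue and suspend. */
static int fake_rwio(struct req *m)
{
    if (data_units > 0) {
        data_units--;
        return OK;
    }
    m->next = NULL;
    q_head = q_tail = m;
    return SUSPEND;
}

/* Same control flow as the listing: stop at the first SUSPEND and
 * splice the unprocessed remainder behind the suspended request. */
static void process_queue(struct req *mq, struct req *tail, struct req **tail_ptr)
{
    struct req *m;

    while (mq) {
        m = mq;
        mq = mq->next;
        if (fake_rwio(m) == SUSPEND) {
            if (mq) {
                (*tail_ptr)->next = mq;
                *tail_ptr = tail;
            }
            return;
        }
    }
}
```

With messages 2 through 5 linked together and data_units set to 2, a call to process_queue() with message 2 as the first argument satisfies messages 2 and 3, suspends on message 4, and leaves the queue as message 4 followed by message 5, with the tail pointing at message 5.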





0015609 {
0015610          mq_t *m;
0015611          int result;
0015612 
0015613          for(;mq;)
Process the linked list of messages by calling sr_rwio() for each message. Quit after all of the messages have been processed by sr_rwio() or when sr_rwio() returns SUSPEND (sr_rwio() will return SUSPEND, for example, if there is not enough data to satisfy a read request).


0015614          {
0015615                   m= mq;
0015616                   mq= mq->mq_next;
0015617 
0015618                   DBLOCK(1, printf("calling rwio\n"));
0015619 
0015620                   result= sr_rwio(m);
sr_rwio()

sr_rwio() is either called by sr_rec() upon receipt of a read, write, or ioctl request or by process_req_q() to handle queued messages. The handling of a read, write, or ioctl request is a complex operation that is best explained by an example. We will analyze two different read requests, one with data ready to be delivered to user processes and one with no data ready to be delivered.

SCENARIO 1: There is data ready to be delivered to the user process and there are no messages in the read queue (i.e., srf_read_q and srf_read_q_tail are both null). The file descriptor was opened by the udp client.

When a message arrives requesting a read operation (i.e., m->mq_mess.m_type == DEV_READ), udp_read() is called, which in turn calls udp_packet2user(). udp_packet2user() calls sr_put_userdata() twice. The first time sr_put_userdata() is called, the data is transferred to the user process. The second time sr_put_userdata() is called, a reply is sent to the file system by sr_reply(), which ultimately calls mq_free(), freeing the message.

SCENARIO 2: There are no read messages waiting in the read queue of a file descriptor opened by the udp client. However, this time, there is no data waiting to be read.

A DEV_READ message (message 1) arrives and sr_rec() calls sr_rwio() to handle the message. Since there are no messages in the read queue, SFF_READ_IP is not raised. After setting the SFF_READ_IP flag, sr_rwio() calls udp_read(). However, since no data is waiting to be read, udp_read() returns NW_SUSPEND. sr_rwio() then sets the SFF_READ_SUSP flag and returns NW_SUSPEND to sr_rec(), which then sends a REVIVE message back to the file system (in other words, it instructs the FS to revive the process requesting the read). After this, the sr file descriptor is shown:



A second DEV_READ message (message 2) arrives and sr_rec() again calls sr_rwio(). Because the SFF_READ_IP flag is already raised by the first message, the second message is appended to the read queue and sr_rwio() returns NW_SUSPEND. Again, sr_reply() sends a REVIVE message to the file system.



A third DEV_READ message (message 3) arrives and, as before, is placed at the end of the read queue and a REVIVE message is sent to the file system.



Finally, some data arrives (possibly from the ethernet task). udp_packet2user() is ultimately called and it, in turn, calls sr_put_userdata() twice: the first time to deliver the packet to the buffer of the user process and the second time to return the number of bytes delivered to the user process. The second time sr_put_userdata() is called, it sets the read queue head to null and calls sr_reply().



After sr_reply() returns, sr_put_userdata() then calls process_req_q(), passing in mq as the first parameter. process_req_q() goes through the remaining messages, starting with mq (see figure above), handling the read requests (by calling sr_rwio()) if possible. If there is enough data to satisfy all of the read requests (i.e., the requests of messages 2 and 3), the read queue is as appears in the figure on the left. If, for example, there is not even enough data to satisfy the request of message 2, the message queue is as appears in the figure on the right.



If there was enough data to satisfy the requests of messages 2 and 3, the remaining messages are freed by sr_reply(), just as message 1 was previously freed. Note that the srf_read_q field in the right-hand figure above was set by sr_rwio() before returning.
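
The flag handling in the two scenarios can be condensed into a toy state machine. This is a sketch under assumptions, not a rendering of sr_rwio(): struct chan, rwio_read(), the F_READ_IP and F_READ_SUSP bits, the data_ready field, and the value of SUSPEND are all invented for the example, and the protocol read (e.g., udp_read()) is reduced to a test of data_ready.

```c
#include <assert.h>
#include <stddef.h>

#define F_READ_IP    0x1    /* models SFF_READ_IP: a read is in progress */
#define F_READ_SUSP  0x2    /* models SFF_READ_SUSP: that read is suspended */
#define OK       0
#define SUSPEND  (-998)     /* placeholder value for this sketch only */

struct req { struct req *next; };           /* hypothetical stand-in for mq_t */
struct chan {                               /* reduced model of one sr_fd */
    int flags;
    struct req *head, *tail;
    int data_ready;                         /* nonzero: a read can complete at once */
};

static int rwio_read(struct chan *c, struct req *m)
{
    m->next = NULL;
    if (c->flags & F_READ_IP) {
        /* A read is already in progress: append and let the caller suspend. */
        c->tail->next = m;
        c->tail = m;
        return SUSPEND;
    }
    c->flags |= F_READ_IP;
    c->head = c->tail = m;
    if (c->data_ready) {
        /* Scenario 1: the read completes immediately and the queue empties. */
        c->data_ready = 0;
        c->head = c->tail = NULL;
        c->flags &= ~(F_READ_IP | F_READ_SUSP);
        return OK;
    }
    /* Scenario 2: nothing to read yet, so the request stays queued. */
    c->flags |= F_READ_SUSP;
    return SUSPEND;
}
```

Three reads on a channel with no data leave both flags raised and the queue holding the three requests in arrival order; a single read on a channel with data ready returns OK and leaves the flags and the queue clear.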


0015621                   if (result == SUSPEND)
sr_rwio() returns SUSPEND if the request couldn't be satisfied. For example, if there was no data to be read and the message was a read request, sr_rwio() would return SUSPEND.


0015622                   {
0015623                            if (mq)
0015624                            {
0015625                                     (*tail_ptr)->mq_next= mq;
This is a mistake. The queue should not be circular.


0015626                                     *tail_ptr= tail;
Reset the tail. Each time that sr_rwio() was called to process a message, both the head and the tail pointed to the message.


0015627                            }
0015628                            return;
0015629                   }
0015630          }
0015631          return;
0015632 }
0015633 
0015634 PRIVATE int cp_u2b (proc, src, var_acc_ptr, size)
0015635 int proc;
0015636 char *src;
0015637 acc_t **var_acc_ptr;
0015638 int size;
cp_u2b()

cp_u2b() copies size (the fourth parameter) bytes from an address in a user process (src, the second parameter) to a linked list of buffers within the network service. The address of the first buffer in the linked list is returned through the third parameter. The process number of the user process is the first parameter and is a field in the message requesting the copy.

cp_u2b() is called only by sr_get_userdata().

"u2b" stands for "user to buffer".


0015639 {
0015640          static message mess;
0015641          acc_t *acc;
0015642          int i;
0015643 
0015644          acc= bf_memreq(size);
bf_memreq()

After the buffers have been initialized, accessors[] looks like the following:



bf_memreq() allocates accessors to the caller. For example, if 1514 bytes of buffer space are requested immediately after the network process starts and each buffer is 512 bytes (the default), then accessors[] will look like the following:



Note that three elements of accessors[] have been removed from buf512_freelist and that the head of the chain of the 3 accessors is returned by bf_memreq(). Also note that the acc_linkC and buf_linkC fields have been set to one and acc_length and acc_offset have been set to their appropriate values.

So what happens if there are not enough buffers on the buf512_freelist to satisfy a request? On lines 2280-2290 of buf.c, functions that free buffers for the specific clients (e.g., eth_buffree()) are called until there are enough buffers on buf512_freelist.
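
The 1514-byte request described above can be sketched in a few lines. This is a simplified model and not the code in buf.c: struct chain_acc, chain_memreq(), and chain_free() are hypothetical names, malloc() stands in for buf512_freelist, and only the length, offset, and link-count bookkeeping is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHAIN_BUF_SIZE 512      /* default buffer size mentioned in the text */

/* Hypothetical, simplified accessor; not the real acc_t. */
struct chain_acc {
    struct chain_acc *next;
    size_t offset;              /* where valid data starts in the buffer */
    size_t length;              /* bytes of the request held by this buffer */
    int link_count;             /* references to this accessor */
};

/* Free every accessor in the chain. */
void chain_free(struct chain_acc *acc)
{
    while (acc != NULL) {
        struct chain_acc *next = acc->next;
        free(acc);
        acc = next;
    }
}

/* Build a chain that covers `size` bytes; returns the head, or NULL if
 * memory runs out. */
struct chain_acc *chain_memreq(size_t size)
{
    struct chain_acc *head = NULL;
    struct chain_acc **link = &head;

    assert(size > 0);
    while (size > 0) {
        struct chain_acc *acc = malloc(sizeof *acc);
        if (acc == NULL) {
            chain_free(head);
            return NULL;
        }
        acc->next = NULL;
        acc->offset = 0;
        acc->length = size < CHAIN_BUF_SIZE ? size : CHAIN_BUF_SIZE;
        acc->link_count = 1;
        size -= acc->length;
        *link = acc;            /* append to the end of the chain */
        link = &acc->next;
    }
    return head;
}
```

For a 1514-byte request, this model produces three accessors with lengths 512, 512, and 490.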


0015645 
0015646          *var_acc_ptr= acc;
One means of passing back a value to the caller of a function is through a reference. In this way, cp_u2b() passes back the address of the linked list of allocated buffers containing the data copied from the user process (see line 15543).


0015647          i=0;
0015648 
0015649          while (acc)
This while loop sets up the SYS_VCOPY (V stands for vector) message on line 15668. Before the message is sent, a vector (cpvec) of all the addresses and sizes of the buffers must be built. The data within the user process is copied to the addresses contained within this vector.


0015650          {
0015651                   size= (vir_bytes)acc->acc_length;
0015652 
cpvec[] is defined in this file:

PRIVATE cpvec_t cpvec[CPVEC_NR];

and cpvec_t is declared in include/minix/type.h:

typedef struct {

vir_bytes cpv_src; /* src address of data */
vir_bytes cpv_dst; /* dst address of data */
vir_bytes cpv_size; /* size of data */
} cpvec_t;



0015653                   cpvec[i].cpv_src= (vir_bytes)src;
0015654                   cpvec[i].cpv_dst= (vir_bytes)ptr2acc_data(acc);
ptr2acc_data()

The macro ptr2acc_data is #define'd in inet/generic/buf.h as:

#define ptr2acc_data(/* acc_t * */ a) (bf_temporary_acc=(a), \
(&bf_temporary_acc->acc_buffer->buf_data_p[bf_temporary_acc-> \
acc_offset]))

ptr2acc_data() simply returns a pointer to the actual data within an accessor.

ptr2acc_data() is usually called so that the fields of a header (e.g., ip header) can be analyzed.
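
The same idea can be written as a small function instead of a macro. This is only a sketch: struct view_buf, struct view_acc, and view_acc_data() are hypothetical stand-ins for buf_t, acc_t, and ptr2acc_data(), reduced to the fields that the macro touches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for buf_t. */
struct view_buf {
    char *data;                 /* start of the underlying storage */
};

/* Hypothetical, simplified stand-in for acc_t. */
struct view_acc {
    struct view_buf *buffer;
    size_t offset;              /* first valid byte within buffer->data */
    size_t length;              /* number of valid bytes */
};

/* Return a pointer to the first valid byte seen through the accessor. */
char *view_acc_data(const struct view_acc *acc)
{
    assert(acc != NULL && acc->buffer != NULL);
    return &acc->buffer->data[acc->offset];
}
```

With an offset of 3, the returned pointer skips the first three bytes of the buffer, which is how an accessor can expose a payload without copying it away from its header.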


0015655                   cpvec[i].cpv_size= size;
0015656 
0015657                   src += size;
0015658                   acc= acc->acc_next;
0015659                   i++;
0015660 
0015661                   if (i == CPVEC_NR || acc == NULL)
For a SYS_VCOPY request, the vector can be a maximum of CPVEC_NR (16) elements. If the size of the vector is less than or equal to CPVEC_NR elements, all of the data from the user process can be copied with a single message. If not, multiple messages to SYSTASK must be sent.
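
The batching rule can be illustrated with a self-contained model of the loop. Everything here is an assumption made for illustration: VEC_MAX mirrors CPVEC_NR, struct piece is a hypothetical stand-in for an accessor, and flush_vec() performs the copies locally with memcpy() where the real code sends a SYS_VCOPY message to SYSTASK.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VEC_MAX 16              /* mirrors CPVEC_NR as described in the text */

struct copy_vec {
    const char *src;
    char *dst;
    size_t size;
};

/* Hypothetical stand-in for one accessor in the chain. */
struct piece {
    struct piece *next;
    char *data;
    size_t length;
};

/* Stand-in for the SYS_VCOPY request: does the copies locally. */
static void flush_vec(const struct copy_vec *vec, int count)
{
    int k;

    assert(count > 0 && count <= VEC_MAX);
    for (k = 0; k < count; k++)
        memcpy(vec[k].dst, vec[k].src, vec[k].size);
}

/* Copy from `src` into the chain; returns the number of batches used. */
int copy_in_batches(const char *src, struct piece *chain)
{
    struct copy_vec vec[VEC_MAX];
    struct piece *p = chain;
    int i = 0, batches = 0;

    while (p != NULL) {
        vec[i].src = src;
        vec[i].dst = p->data;
        vec[i].size = p->length;
        src += p->length;
        p = p->next;
        i++;

        /* Flush when the vector is full or the chain has ended. */
        if (i == VEC_MAX || p == NULL) {
            flush_vec(vec, i);
            batches++;
            i = 0;
        }
    }
    return batches;
}
```

A chain of 35 pieces therefore needs three batches (16, 16, and 3 entries), while a chain of 16 or fewer pieces needs only one.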


0015662                   {
0015663                            mess.m_type= SYS_VCOPY;
0015664                            mess.m1_i1= proc;
0015665                            mess.m1_i2= this_proc;
0015666                            mess.m1_i3= i;
0015667                            mess.m1_p1= (char *)cpvec;
0015668                            if (sendrec(SYSTASK, &mess) <0)
0015669                                     ip_panic(("unable to sendrec"));
Send the message to SYSTASK (see OSDI, Section 3.10) requesting the copy and wait for it to finish.

The message will have the following format:

m_source
m_type: SYS_VCOPY
m1_i1: src process
m1_i2: dst process
m1_i3: vec size
m1_p1: vec addr (char *)
(the remaining fields are unused)


0015670                            if (mess.m_type <0)
sendrec() itself succeeded, but the reply from SYSTASK carries a negative value in m_type, which indicates that the copy failed.


0015671                            {
0015672                                     bf_afree(*var_acc_ptr);
bf_afree()

After a chain of accessors is no longer needed, the chain (and not simply the single accessor passed as the parameter) can be freed by calling bf_afree(). However, if either acc_linkC or buf_linkC of one of the accessors in the linked list is not equal to one (1), the entire chain will not be freed. For example, if bf_afree(acc1) is called for the following chain:



Then the resulting chain will be:



bf_afree() returns acc1 (accessors[63]) to acc_freelist (recall that acc_freelist is the linked list of acc_t's without an associated buffer). However, buffers512[127] cannot be freed because acc2 (accessors[64]) still references it.

bf_afree() is called after an accessor's associated data is no longer needed (for example, after a packet has been sent off by the ethernet driver).
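
The link-count behavior described above can be modeled as follows. This is a simplified sketch under stated assumptions, not the code in buf.c: struct rc_buf, struct rc_acc, and rc_afree() are hypothetical names, and a `released` flag stands in for returning an element to a free list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a buffer. */
struct rc_buf {
    int link_count;             /* accessors that point at this buffer */
    int released;               /* set to 1 once the buffer is given back */
};

/* Hypothetical, simplified stand-in for an accessor. */
struct rc_acc {
    struct rc_acc *next;
    struct rc_buf *buffer;
    int link_count;             /* references to this accessor */
    int released;               /* set to 1 once the accessor is given back */
};

/* Drop one reference to each accessor in the chain. The walk stops at the
 * first accessor that is still referenced elsewhere. Returns the number of
 * buffers that were released. */
int rc_afree(struct rc_acc *acc)
{
    int buffers_released = 0;

    while (acc != NULL) {
        struct rc_acc *next = acc->next;

        assert(acc->link_count > 0);
        if (--acc->link_count > 0)
            break;              /* the rest of the chain is still in use */

        assert(acc->buffer->link_count > 0);
        if (--acc->buffer->link_count == 0) {
            acc->buffer->released = 1;
            buffers_released++;
        }
        acc->released = 1;      /* the accessor itself is always returned */
        acc = next;
    }
    return buffers_released;
}
```

When two accessors share one buffer, freeing the first releases only the accessor. The buffer is released when the second accessor is freed.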


0015673                                     *var_acc_ptr= 0;
0015674                                     return mess.m_type;
0015675                            }
0015676                            i= 0;
0015677                   }
0015678          }
0015679          return OK;
0015680 }
0015681 
0015682 PRIVATE int cp_b2u (acc_ptr, proc, dest)
0015683 acc_t *acc_ptr;
0015684 int proc;
0015685 char *dest;
cp_b2u()

cp_b2u(acc_ptr, proc, dest) copies all of the data in the accessor linked list acc_ptr, cp_b2u()'s first parameter, to a location dest, cp_b2u()'s third parameter, in the user process proc, cp_b2u()'s second parameter.

cp_b2u() is called only by sr_put_userdata().

"b2u" stands for "buffer to user".


0015686 {
0015687          static message mess;
0015688          acc_t *acc;
0015689          int i, size;
0015690 
0015691          acc= acc_ptr;
0015692          i=0;
0015693 
0015694          while (acc)
This while loop sets up the SYS_VCOPY (V stands for vector) message on line 15716. Before the message is sent, a vector (cpvec) of all the addresses and sizes of the buffers must be built. The data within these buffers is copied to the user process.
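
The net effect of this loop, together with the if (size) test on line 15698, is a gather operation. The sketch below shows that effect in isolation. It is an illustration only: struct seg and gather_to_flat() are hypothetical names, and memcpy() replaces the SYS_VCOPY request that the real code sends to SYSTASK.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one accessor and its data. */
struct seg {
    struct seg *next;
    const char *data;
    size_t length;
};

/* Copy every non-empty segment of the chain, in order, into `dest`.
 * Returns the total number of bytes copied. */
size_t gather_to_flat(const struct seg *chain, char *dest)
{
    const struct seg *s;
    size_t total = 0;

    assert(dest != NULL);
    for (s = chain; s != NULL; s = s->next) {
        if (s->length == 0)
            continue;           /* empty segments contribute nothing */
        memcpy(dest + total, s->data, s->length);
        total += s->length;
    }
    return total;
}
```

Skipping empty segments matters in the real code because each vector entry uses one of the CPVEC_NR slots.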


0015695          {
0015696                   size= (vir_bytes)acc->acc_length;
0015697 
0015698                   if (size)
0015699                   {
0015700                            cpvec[i].cpv_src= (vir_bytes)ptr2acc_data(acc);
0015701                            cpvec[i].cpv_dst= (vir_bytes)dest;
0015702                            cpvec[i].cpv_size= size;
0015703                            i++;
0015704                   }
0015705 
0015706                   dest += size;
0015707                   acc= acc->acc_next;
0015708 
0015709                   if (i == CPVEC_NR || acc == NULL)
SYS_VCOPY is a message that a server (network service, file system, memory manager, etc.) sends to the system task to request a memory copy. Note that a user process cannot send this message. For a detailed description, see section 3.10 of Operating Systems, Design and Implementation.

For a SYS_VCOPY request, the vector can hold at most CPVEC_NR (16) elements. If the vector needs no more than CPVEC_NR elements, all of the data can be copied to the user process with a single message. If not, multiple messages must be sent to SYSTASK.


0015710                   {
0015711                            mess.m_type= SYS_VCOPY;
0015712                            mess.m1_i1= this_proc;
this_proc is a global variable that is equal to the process number of the network service. this_proc was set during the initialization of the network service.


0015713                            mess.m1_i2= proc;
0015714                            mess.m1_i3= i;
0015715                            mess.m1_p1= (char *)cpvec;
0015716                            if (sendrec(SYSTASK, &mess) <0)
0015717                                     ip_panic(("unable to sendrec"));
0015718                            if (mess.m_type <0)
sendrec() itself succeeded, but the reply from SYSTASK carries a negative value in m_type, which indicates that the copy failed.


0015719                            {
0015720                                     bf_afree(acc_ptr);
0015721                                     return mess.m_type;
0015722                            }
0015723                            i= 0;
0015724                   }
0015725          }
0015726          bf_afree(acc_ptr);
0015727          return OK;
0015728 }
0015729 
0015730 PRIVATE int sr_repl_queue(proc, ref, operation)
0015731 int proc;
0015732 int ref;
0015733 int operation;
sr_repl_queue()

If repl_queue is not empty, sr_repl_queue(proc, ref, operation) is called to send all of the messages in repl_queue (all of which are REVIVE messages) to the file system.

If proc, sr_repl_queue()'s first parameter, is set to ANY (#define'd in include/minix/com.h; this value is outside the allowable range of process numbers), sr_repl_queue() goes through repl_queue and sends each reply message in order.

If proc is set to a specific process number (which is the case when sr_rec() is processing an NW_CANCEL message), sr_repl_queue() goes through repl_queue and sends every REVIVE message except the one destined for the process specified by proc (if one exists). After all of the other messages have been sent, sr_repl_queue() sends the message destined for the process specified by proc. The file system is still interested in this REVIVE message, but the other replies take priority.

In addition to sending all the reply messages queued in repl_queue, sr_repl_queue() frees the messages in repl_queue for later use and sets repl_queue to NULL.

A return value of 1 indicates that a message in repl_queue was canceled. A return value of 0 indicates that no messages were canceled.

The second and third parameters (ref and operation) are never used in sr_repl_queue().
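
The ordering rule can be modeled with a short sketch. The names here are hypothetical: struct reply stands in for an mq_t, DEMO_ANY stands in for ANY (its value is an arbitrary assumption), and the "send" is replaced by recording the order in an array so that the result can be inspected.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ANY 9999           /* hypothetical sentinel, matches no process */

/* Hypothetical stand-in for a queued REVIVE message. */
struct reply {
    struct reply *next;
    int proc;                   /* process that the reply is destined for */
};

/* Record in out[] the order in which the replies would be sent, and store
 * the number of replies in *count. The reply for `proc`, if present, goes
 * last. Returns 1 if such a reply was found and 0 otherwise. */
int drain_replies(struct reply *queue, int proc, int *out, int *count)
{
    struct reply *m, *deferred = NULL;
    int n = 0;

    for (m = queue; m != NULL; m = m->next) {
        if (m->proc == proc) {
            assert(deferred == NULL);   /* at most one reply per process */
            deferred = m;
            continue;
        }
        out[n++] = m->proc;
    }
    if (deferred != NULL)
        out[n++] = deferred->proc;
    *count = n;
    return deferred != NULL;
}
```

For a queue holding replies for processes 5, 7, and 9, a call with proc set to 7 yields the order 5, 9, 7 and returns 1. A call with DEMO_ANY keeps the original order and returns 0.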


0015734 {
0015735          mq_t *m, *m_cancel, *m_tmp;
0015736          int result;
0015737 
0015738          m_cancel= NULL;
0015739 
0015740          for (m= repl_queue; m;)
Attempt to send each of the REVIVE messages in repl_queue to the file system.


repl_queue / repl_queue_tail


repl_queue is the head of the linked list (repl_queue_tail is the tail) of REVIVE messages that the network service previously attempted to send to the file system (FS) but that could not be delivered due to deadlock. Before any new messages are processed by the network service, a second (and last) attempt is made by sr_repl_queue() to send out the REVIVE messages in repl_queue.


0015741          {
0015742                   if (m->mq_mess.REP_PROC_NR == proc)
If this REVIVE message is destined for the process that just requested an NW_CANCEL (see line 15128), set it aside in m_cancel so that it can be sent last. This message is handled on lines 15759 - 15768.


0015743                   {
0015744 assert(!m_cancel);
0015745                            m_cancel= m;
0015746                            m= m->mq_next;
0015747                            continue;
0015748                   }
0015749 assert(m->mq_mess.m_source != MM_PROC_NR);
0015750 assert(m->mq_mess.m_type == REVIVE);
0015751                   result= send(m->mq_mess.m_source, &m->mq_mess);
Attempt to send the REVIVE message to the file system. This message will be followed by an NW_CANCEL message.


0015752                   if (result != OK)
0015753                            ip_panic(("unable to send: %d", result));
0015754                   m_tmp= m;
0015755                   m= m->mq_next;
0015756                   mq_free(m_tmp);
mq_free()

mq_free(mq) places mq, mq_free()'s only parameter, back on mq_freelist.
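
Returning an element to a free list is a constant-time push onto a singly linked list. The sketch below assumes a last-in, first-out free list, which is an assumption for illustration and not taken from the mq.c source. struct node, node_free(), and node_alloc() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an mq_t. */
struct node {
    struct node *next;
    int payload;
};

static struct node *free_list = NULL;   /* head of the list of unused nodes */

/* Push a node that is no longer needed onto the free list. */
void node_free(struct node *n)
{
    assert(n != NULL);
    n->next = free_list;
    free_list = n;
}

/* Pop a node from the free list; returns NULL if none are available. */
struct node *node_alloc(void)
{
    struct node *n = free_list;

    if (n != NULL) {
        free_list = n->next;
        n->next = NULL;
    }
    return n;
}
```

Because no memory is allocated or released at run time, a pool of this kind cannot fail unexpectedly in the middle of message handling. It can only run out of free elements.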



0015757          }
0015758          repl_queue= NULL;
0015759          if (m_cancel)
If a process just sent an NW_CANCEL message and there is a REVIVE message in repl_queue destined for this process, send this REVIVE message last.


0015760          {
0015761 assert(m_cancel->mq_mess.m_source != MM_PROC_NR);
0015762 assert(m_cancel->mq_mess.m_type == REVIVE);
0015763                   result= send(m_cancel->mq_mess.m_source, &m_cancel->mq_mess);
Attempt to send the REVIVE message to the file system. This message will be followed by an NW_CANCEL message.


0015764                   if (result != OK)
0015765                            ip_panic(("unable to send: %d", result));
0015766                   mq_free(m_cancel);
0015767                   return 1;
0015768          }
0015769          return 0;
0015770 }
0015771 
0015772 /*
0015773  * $PchId: sr.c,v 1.9 1996/05/07 21:11:14 philip Exp $
0015774  */