Manually patching source code    Sameer N. Ingole (SNI) Status: Unmaintained
How-to patch source code manually Last Updated: 7 July 2005
Introduction

Patches are created using the diff command, which finds and shows the differences between two files. It can output those differences line by line in several formats, selected by command-line options. Such a set of differences is often called a patch or a diff.

If two files are identical, diff produces no output. Identical here means identical down to the spaces and empty lines, not just the same text or content. By "difference" I mean a series of lines that were changed, inserted, or deleted in one file to produce the other. diff can output these differences in many formats; which one suits you best depends on what you intend to do with the output. For more, refer to the manpage for diff.
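The behaviour described above can be tried without touching any files. The following is a minimal sketch that uses Python's difflib module as a stand-in for the diff command (it is a similar line-based comparison, not GNU diff itself). The file names old.c and new.c and the sample lines are made up for illustration.

```python
import difflib

old = ["int main(void)\n", "{\n", "    return 0;\n", "}\n"]
same = list(old)
# Same text, but with one extra trailing space on the third line.
spaced = ["int main(void)\n", "{\n", "    return 0; \n", "}\n"]
changed = ["int main(void)\n", "{\n", "    return 1;\n", "}\n"]


def unified(a, b):
    """Return the unified diff of two line lists as a list of strings."""
    return list(difflib.unified_diff(a, b, fromfile="old.c", tofile="new.c"))


identical_output = unified(old, same)    # nothing differs, so no output
spaced_output = unified(old, spaced)     # a single extra space is a difference
changed_output = unified(old, changed)   # two header lines plus one hunk

print(len(identical_output))
print("".join(changed_output), end="")
```

Only the identical pair produces an empty result; the pair that differs by one space is reported as a change, just like the pair that differs in content.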

Terminology

hunk

When comparing two files, diff finds sequences of lines common to both files, interspersed with groups of differing lines called hunks. Comparing two identical files yields one sequence of common lines and no hunks, because no lines differ. Comparing two entirely different files with nothing in common yields no common lines and one big hunk that contains all the lines of both files.

diff tries to minimize the total hunk size by finding large sequences of common lines interspersed with small hunks of differing lines. Whether lines that differ only in spaces and tabs are reported as differences depends on the switches used, such as -b and -w.
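To make the idea of a hunk concrete, here is a small sketch that lists the runs of differing lines between two sequences. It assumes Python's difflib.SequenceMatcher as the comparison engine, which is not the algorithm GNU diff uses, and the helper name hunks is invented for this example.

```python
import difflib


def hunks(a, b):
    """Return (tag, old_lines, new_lines) for each run of differing lines."""
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # "equal" runs are the common lines, not hunks
    ]


base = ["a", "b", "c", "d"]

no_hunks = hunks(base, base)                   # identical input
one_small = hunks(base, ["a", "x", "c", "d"])  # one changed line
one_big = hunks(base, ["w", "x", "y", "z"])    # nothing in common

print(no_hunks)
print(one_small)
print(one_big)
```

The three calls correspond to the three cases above: no hunks for identical input, one small hunk surrounded by common lines, and one big hunk holding every line of both inputs.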

patch takes comparison output produced by diff and applies the differences to a copy of the original file, producing a patched version. With patch, you can distribute just the changes to a set of files instead of distributing the entire file set; your correspondents can apply patch to update their copy of the files with your changes. patch automatically determines the diff format, skips any leading or trailing headers, and uses the headers to determine which file to patch.

context

Usually, when we look at the differences between files, we will also want to see the parts of the files near the lines that differ, to help us understand exactly what has changed. These nearby parts of the files are called the context. The context output format shows several lines of context around the lines that differ. It is the standard format for distributing updates to source code.

patch can detect when the line numbers mentioned in the patch are incorrect, and it attempts to find the correct place to apply each hunk of the patch. As a first guess, it takes the line number mentioned in the hunk, plus or minus any offset used in applying the previous hunk. If that is not the correct place, patch scans both forward and backward for a set of lines matching the context given in the hunk.
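The search described above can be sketched in a few lines. This is only an illustration of the idea, not the code patch actually runs: the function name find_hunk, the 0-based positions, and the strict "nearest match wins" order are assumptions made for the example.

```python
def find_hunk(lines, context, guess):
    """Return the index where `context` matches in `lines`, or None.

    The guessed position is tried first; after that the search moves one
    step further away on each pass, looking forward and then backward.
    """
    n = len(context)
    last_start = len(lines) - n
    for distance in range(len(lines) + 1):
        for start in (guess + distance, guess - distance):
            if 0 <= start <= last_start and lines[start:start + n] == context:
                return start
    return None


target = [
    "#include <a.h>",
    "#include <b.h>",
    "",
    "int f(void)",
    "{",
    "    return 0;",
    "}",
]
context = ["int f(void)", "{", "    return 0;"]

guess = 1                      # where the patch expected the hunk (0-based)
start = find_hunk(target, context, guess)
offset = start - guess         # carried over as the first guess for the next hunk

print(start, offset)
```

The offset computed here plays the same role as the "offset 59 lines" that patch reports when it applies a hunk somewhere other than the stated line number.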

fuzz

First, patch looks for a place where all lines of the context match. If it cannot find such a place, and it is reading a context or unified diff, and the maximum fuzz factor is set to 1 or more, then patch makes another scan, ignoring the first and last line of context. If that fails, and the maximum fuzz factor is set to 2 or more, it makes another scan, ignoring the first two and last two lines of context. It continues similarly if the maximum fuzz factor is larger.
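The fuzz passes can be sketched as follows. This is a simplified model written for illustration, not the implementation in patch: the function name find_with_fuzz is invented, the hunk is assumed to be already split into leading context, old body lines, and trailing context, and the search simply scans the whole file from the top.

```python
def find_with_fuzz(lines, leading, body, trailing, max_fuzz=2):
    """Return (start, fuzz) for the lowest fuzz level that matches, or None.

    At fuzz level f the first f lines of leading context and the last f
    lines of trailing context are ignored; `start` is where the remaining
    lines were found.
    """
    for fuzz in range(max_fuzz + 1):
        lead = leading[min(fuzz, len(leading)):]
        trail = trailing[:max(len(trailing) - fuzz, 0)]
        pattern = lead + body + trail
        if not pattern:
            break  # nothing left to match against
        n = len(pattern)
        for start in range(len(lines) - n + 1):
            if lines[start:start + n] == pattern:
                return start, fuzz
    return None


leading = ["l1", "l2", "l3"]
body = ["old"]
trailing = ["t1", "t2", "t3"]

# The outermost context lines have changed in these two files.
file_one = ["L1", "l2", "l3", "old", "t1", "t2", "T3"]
file_two = ["x", "y", "l3", "old", "t1", "z", "w"]

print(find_with_fuzz(file_one, leading, body, trailing))
print(find_with_fuzz(file_two, leading, body, trailing))
print(find_with_fuzz(file_two, leading, body, trailing, max_fuzz=1))
```

The first file needs one line of context dropped at each end, the second needs two, and with the maximum fuzz factor lowered to 1 the second file is no longer matched at all.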

If patch cannot find a place to install a hunk of the patch, it writes the hunk out to a reject file, normally named after the file being patched with .rej appended (see "Reject Names" in the patch documentation for how reject files are named). It writes out rejected hunks in context format no matter what form the input patch is in. The line numbers on the hunks in the reject file may be different from those in the patch file: they show the approximate location where patch thinks the failed hunks belong in the new file rather than in the old one.

An Example

As an example, consider what happens when we apply qmail-queue-custom-error.patch to the already patched source of qmail-ldap. We get the following output:

$ patch < ../qmail-queue-custom-error.patch
patching file qmail.c
Hunk #1 FAILED at 14.
Hunk #2 FAILED at 45.
Hunk #3 succeeded at 152 with fuzz 2 (offset 59 lines).
Hunk #4 succeeded at 200 (offset 59 lines).
2 out of 4 hunks FAILED -- saving rejects to file qmail.c.rej
patching file qmail.h

Here is the second hunk from the reject file qmail.c.rej:

***************
*** 35,40 ****

    qq->fdm = pim[1]; close(pim[0]);
    qq->fde = pie[1]; close(pie[0]);
    substdio_fdbuf(&qq->ss,write,qq->fdm,qq->buf,sizeof(qq->buf));
    qq->flagerr = 0;
    return 0;
--- 45,51 ----

    qq->fdm = pim[1]; close(pim[0]);
    qq->fde = pie[1]; close(pie[0]);
+   qq->fderr = pierr[0]; close(pierr[1]);
    substdio_fdbuf(&qq->ss,write,qq->fdm,qq->buf,sizeof(qq->buf));
    qq->flagerr = 0;
    return 0;

Let us try to understand this hunk; it is actually pretty easy. In the above hunk, the lines

***************
*** 35,40 ****

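tell us that the old side of the hunk covers lines 35 through 40 of the original file (the file to which we tried to apply the patch). The two numbers are the first and last line of the range, not a start line and a length, so the range is six lines long. Similarly, in the lines below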

--- 45,51 ----

    qq->fdm = pim[1]; close(pim[0]);
    qq->fde = pie[1]; close(pie[0]);
+   qq->fderr = pierr[0]; close(pierr[1]);
    substdio_fdbuf(&qq->ss,write,qq->fdm,qq->buf,sizeof(qq->buf));
    qq->flagerr = 0;
    return 0;

the first line, --- 45,51 ----, says that the new side of the hunk covers lines 45 through 51 of the patched file: seven lines, one more than the old side because one line is added. The + sign at the start of the line + qq->fderr = pierr[0]; close(pierr[1]); indicates that this line is to be added to the original file (in this case qmail.c). If there is a - sign in front of a line, that line is to be deleted from the original file. Because patch could not place this hunk, you have to find the surrounding context lines in qmail.c yourself and insert the line marked with + by hand, leaving out the + marker itself.
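In the context format, the two numbers in a range header are the first and the last line of the range. The sketch below parses such a header and works out how many lines it covers; the helper name parse_range and the regular expression are written for this example and only handle the simple header forms shown in this document.

```python
import re

# Matches "*** 35,40 ****" (old side) and "--- 45,51 ----" (new side).
# A single number, as in "*** 5 ****", is treated as a one-line range.
RANGE = re.compile(r"^(\*\*\*|---) (\d+)(?:,(\d+))? (\*\*\*\*|----)$")


def parse_range(header):
    """Return (first_line, last_line, line_count) for a context-diff range header."""
    match = RANGE.match(header.strip())
    if match is None:
        raise ValueError("not a context-diff range header: %r" % header)
    first = int(match.group(2))
    last = int(match.group(3)) if match.group(3) else first
    return first, last, last - first + 1


old_side = parse_range("*** 35,40 ****")
new_side = parse_range("--- 45,51 ----")

print(old_side)
print(new_side)
```

For the hunk above this gives six lines on the old side and seven on the new side, which agrees with the single added line.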

In this way we edit the file hunk by hunk, working through the reject file until every rejected hunk has been applied.
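The manual edit for the hunk discussed above amounts to finding one context line and inserting the new line after it. As a rough sketch of that single step, assuming the file has been read into a list of lines: the helper name insert_after is invented here, and the indentation of the sample lines is illustrative rather than copied from qmail.c.

```python
def insert_after(lines, anchor, new_line):
    """Insert new_line after the one line whose text equals anchor.

    Leading and trailing whitespace is ignored when comparing. The call
    fails if the anchor is missing or ambiguous, because a manual edit
    should never guess between several candidate places.
    """
    hits = [i for i, line in enumerate(lines) if line.strip() == anchor.strip()]
    if len(hits) != 1:
        raise ValueError("expected exactly one anchor line, found %d" % len(hits))
    at = hits[0] + 1
    return lines[:at] + [new_line] + lines[at:]


source = [
    "  qq->fdm = pim[1]; close(pim[0]);",
    "  qq->fde = pie[1]; close(pie[0]);",
    "  substdio_fdbuf(&qq->ss,write,qq->fdm,qq->buf,sizeof(qq->buf));",
    "  qq->flagerr = 0;",
    "  return 0;",
]

patched = insert_after(
    source,
    "qq->fde = pie[1]; close(pie[0]);",
    "  qq->fderr = pierr[0]; close(pierr[1]);",
)

print("\n".join(patched))
```

Refusing to act on a missing or repeated anchor mirrors what a careful person does by hand: if the context cannot be identified with certainty, the hunk needs a closer look before anything is changed.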