Showing posts with label C. Show all posts

Thursday, November 17, 2011

Your own linker warnings using the GNU toolchain


You've probably seen linker warnings like these:

linkwarnmain.c:(.text+0x1d): warning: foo is deprecated, please use the shiny new foobar function
linkwarnmain.c:(.text+0x27): warning: the use of `tmpnam' is dangerous, better use `mkstemp'

These warnings are actually stored in ELF sections named .gnu.warning.symbolname:

$ objdump -s -j .gnu.warning.gets /lib/libc.so.6

/lib/libc.so.6:     file format elf32-i386

Contents of section .gnu.warning.gets:
 0000 74686520 60676574 73272066 756e6374  the `gets' funct
 0010 696f6e20 69732064 616e6765 726f7573  ion is dangerous
 0020 20616e64 2073686f 756c6420 6e6f7420   and should not
 0030 62652075 7365642e 00                 be used..

So when you try to use a symbol, if the linker sees a section whose name matches the above pattern, it emits the corresponding warning message.

You can add your own linker warnings to your source files:

linkwarn.c
void foo(void)
{
}

static const char foo_warning[] __attribute__((section(".gnu.warning.foo"))) =
        "foo is deprecated, please use the shiny new foobar function";

linkwarnmain.c
void foo(void);

int main()
{
        foo();

        return 0;
}

Then, when you link, you will get a warning:

$ gcc -Wall -c linkwarn.c
$ gcc -Wall -o linkwarnmain linkwarnmain.c linkwarn.o
/tmp/ccyHLTw6.o: In function `main':
linkwarnmain.c:(.text+0x1d): warning: foo is deprecated, please use the shiny new foobar function

The glibc machinery for emitting linker warnings is a little more complicated:

libc-symbols.h
...
#ifdef HAVE_ELF

/* We want the .gnu.warning.SYMBOL section to be unallocated.  */
# ifdef HAVE_ASM_PREVIOUS_DIRECTIVE
#  define __make_section_unallocated(section_string)    \
  asm (".section " section_string "\n\t.previous");
# elif defined HAVE_ASM_POPSECTION_DIRECTIVE
#  define __make_section_unallocated(section_string)    \
  asm (".pushsection " section_string "\n\t.popsection");
# else
#  define __make_section_unallocated(section_string)
# endif

/* Tacking on "\n\t#" to the section name makes gcc put it's bogus
   section attributes on what looks like a comment to the assembler.  */
# ifdef HAVE_SECTION_QUOTES
#  define __sec_comment "\"\n\t#\""
# else
#  define __sec_comment "\n\t#"
# endif
# define link_warning(symbol, msg) \
  __make_section_unallocated (".gnu.warning." #symbol) \
  static const char __evoke_link_warning_##symbol[]     \
    __attribute__ ((used, section (".gnu.warning." #symbol __sec_comment))) \
    = msg;
...

The warning message is marked as used so that the optimizer does not remove it entirely. The section is also left unallocated, so that the loader does not load it into memory. Since gcc's section attribute does not allow changing the section's default flags, the solution is to declare the section first using asm, return to the previous section (the one the compiler was using beforehand), and hide from the assembler (via __sec_comment) the flags that gcc adds in its section attribute output. Otherwise we would get something similar to:

$ gcc -Wall -S linkwarn.c
$ cat linkwarn.s
        .file   "linkwarn.c"
        .text
.globl foo
        .type   foo, @function
foo:
        pushl   %ebp
        movl    %esp, %ebp
        popl    %ebp
        ret
        .size   foo, .-foo
        .section        .gnu.warning.foo,"a",@progbits
        .align 32
        .type   foo_warning, @object
        .size   foo_warning, 60
foo_warning:
        .string "foo is deprecated, please use the shiny new foobar function"
        .section        .note.GNU-stack,"",@progbits
        .ident  "GCC: (GNU) 3.4.5 (Gentoo 3.4.5-r1, ssp-3.4.5-1.0, pie-8.7.9)"


... where the section is marked as allocatable via the "a" flag.
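
For your own code, a much smaller macro is often enough. The following is only a sketch, loosely modeled on glibc's link_warning but without the unallocated-section trick, so the section keeps gcc's default "a" flag. LINK_WARNING and legacy_sum are names made up for this example, and it assumes gcc (or a compatible compiler) targeting ELF:

```c
/* Hypothetical helper: attaches a linker warning to a symbol by placing
 * the message in a section named .gnu.warning.<symbol>. Unlike glibc's
 * link_warning, it does not try to make the section unallocated. */
#define LINK_WARNING(symbol, msg)                                        \
        static const char link_warning_##symbol[]                        \
                __attribute__((used, section(".gnu.warning." #symbol)))  \
                = msg

/* Illustrative deprecated function (not from glibc or the post above) */
int legacy_sum(int a, int b)
{
        return a + b;
}

LINK_WARNING(legacy_sum,
             "legacy_sum is deprecated, please use the new sum function");
```

As with linkwarn.c above, the idea is to compile this into its own object file; linking it with code that calls legacy_sum() should then make the linker print the message.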

Monday, October 17, 2011

dmr


I remember SuSE Linux had a motd that simply said: "Have fun..." That has always been the spirit of Unix, having an environment in which it is fun to work and tinker about. We owe that to Ken Thompson, Dennis Ritchie and their colleagues at Bell Labs (http://www.princeton.edu/~hos/Mahoney/unixpeople.htm).

Dennis Ritchie also evolved Thompson's B into C, giving us a concise and practical systems programming language. Kernighan and Ritchie taught many people how to write great code, and some of those people taught others, either directly or simply by sharing great code. The end result has been a ripple effect that has had a profound and positive impact on how many of us code. Not many people in history have influenced the work of others so much.

Dennis Ritchie is no longer with us, and he will be greatly missed. He is among those luminaries we will never forget. The computing community will never forget Jon Postel, Richard Stevens and Dennis Ritchie.

Obituaries:
http://www.nytimes.com/2011/10/14/technology/dennis-ritchie-programming-trailblazer-dies-at-70.html?hp
http://www.guardian.co.uk/technology/2011/oct/13/dennis-ritchie
http://www.bbc.co.uk/news/technology-15287391

Monday, October 10, 2011

Tentative Definitions in C



Suppose you want to access a variable from several translation units. You would declare it in a header file:

example.h
int foo;

You would optionally define it in one translation unit:

example.c
int foo = 2;

And you would refer to it in one or more other translation units:

example-main.c
#include <stdio.h>

#include "example.h"

int main()
{
        printf("foo = %d\n", foo);

        return 0;
}

If we compile and run the above we get:


$ gcc -Wall -o example example-main.c example.c
$ ./example
foo = 2

int foo; from example.h is a tentative definition. If you comment out the definition in example.c, the tentative definition acts as a definition with an initializer of 0:


$ gcc -Wall -o example example-main.c example.c
$ ./example
foo = 0
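
The same rule can be seen inside a single translation unit. This is just a small sketch to illustrate it; counter, total and the two accessor functions are names made up for the example, and it has to be compiled as C, not C++:

```c
/* All three declarations refer to the same object. The first two are
 * tentative definitions; the third one is the actual definition. */
int counter;
int counter;
int counter = 2;

/* Only tentative definitions: at the end of the translation unit this
 * behaves as if "int total = 0;" had been written. */
int total;
int total;

/* Illustrative accessors (hypothetical names) */
int get_counter(void)
{
        return counter;
}

int get_total(void)
{
        return total;
}
```

Repeating a declaration without an initializer is harmless in C, which is why a header containing int foo; can be included in the same file that defines foo.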

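But it can be considered better style to restrict header files to declarations, with no definitions, not even tentative ones. It is also more portable: linking several translation units that each contain a tentative definition of the same variable relies on the traditional "common" symbol model, and GCC 10 and later default to -fno-common, which turns that into a multiple definition error. So we would add an extern storage class specifier, which turns the declaration in the header file from a tentative definition into a mere declaration: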

example.h
extern int foo;

In this case, if we don't define foo in any translation unit, the linker will error out:


$ gcc -Wall -o example example-main.c example.c
/tmp/ccS3FMxx.o: In function `main':
example-main.c:(.text+0x1d): undefined reference to `foo'
collect2: ld returned 1 exit status

All of the above is for C. C++ doesn't have tentative definitions, and therefore has different rules on what constitutes a definition. In C++, int foo; constitutes a definition, while extern int foo; constitutes a declaration. So, if we leave out the extern storage class specifier in C++, the linker will error out:


$ g++ -Wall -o example example-main.cpp example.cpp
/tmp/cc8fzx4x.o:(.data+0x0): multiple definition of `foo'
/tmp/ccxLfXPo.o:(.bss+0x0): first defined here
collect2: ld returned 1 exit status

Saturday, July 2, 2011

Integer Promotions and Conversions in C

Suppose you have the following code:
#include <stdio.h>

int main()
{
    unsigned x = 1;
    char y = -1;

    if (x > y)
        printf("x > y\n");
    else
        printf("x <= y\n");

    return 0;
}
What really happens here? Since we are dealing with integer types, the integer promotions are applied first, and then the usual arithmetic conversions.

If char is equivalent to signed char:
  • char is promoted to int (Integer Promotions, ISO C99 §6.3.1.1 ¶2)
  • Since int and unsigned have the same rank, int is converted to unsigned (Arithmetic Conversions, ISO C99 §6.3.1.8)
If char is equivalent to unsigned char:
  • char may be promoted to either int or unsigned int:
    • If int can represent all unsigned char values (typically because sizeof(int) > sizeof(char)), char is converted to int.
    • Otherwise (typically because sizeof(char)==sizeof(int)), char is converted to unsigned.
  • Now we have one operand that is either int or unsigned, and another that is unsigned. The first operand is converted to unsigned.
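
To make the steps above concrete, here is a sketch that spells out the implicit conversions by hand. The function name unsigned_gt_char is made up for this example, and it assumes the common case where int can represent all values of char:

```c
/* Spells out what the compiler does implicitly for "x > y" when x is
 * unsigned and y is char. Assumes int can represent all char values. */
int unsigned_gt_char(unsigned x, char y)
{
        int promoted = y;                         /* integer promotion */
        unsigned converted = (unsigned)promoted;  /* arithmetic conversion */

        return x > converted;
}
```

With x = 1 and y = -1 this returns 0. If char is signed, -1 wraps around to UINT_MAX; if char is unsigned, y already holds UCHAR_MAX. Either way y ends up larger than 1, so the program at the top prints "x <= y".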

The rules that are applied are the following:

Integer promotions: An expression of a type of lower rank than int is converted to int if int can hold all of the values of the original type, and to unsigned otherwise.

Arithmetic conversions: Try to convert to the larger type. When there is a conflict between signed and unsigned, if the larger type (including the case where the two types have the same rank) is unsigned, go with unsigned. Otherwise, go with signed only if it can represent all the values of both types.

Conversions between integer types (ISO C99 §6.3.1.3):
  • Conversion of an out-of-range value to an unsigned integer type is done via wrap-around (modular arithmetic).
  • Conversion of an out-of-range value to a signed integer type either yields an implementation-defined result or raises an implementation-defined signal (for example SIGFPE).
Ranks: Every type has a rank. unsigned types have the same rank as the corresponding signed type. Ranks satisfy the following: char < short < int < long < long long.
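
The wrap-around rule for conversions to unsigned types can be checked with a couple of one-liners. This is only a sketch, and to_unsigned and to_uchar are names made up for the example:

```c
#include <limits.h>

/* An out-of-range value is reduced modulo UINT_MAX + 1 */
unsigned to_unsigned(int v)
{
        return (unsigned)v;
}

/* Same rule for a narrower type: reduced modulo UCHAR_MAX + 1 */
unsigned char to_uchar(int v)
{
        return (unsigned char)v;
}
```

Here to_unsigned(-1) yields UINT_MAX and to_uchar(-1) yields UCHAR_MAX, whatever representation the implementation uses for negative numbers.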

Representation of integer types:
  • Signed types consist of one sign bit, padding bits and value bits.
  • Unsigned types consist of padding bits and value bits.
  • An unsigned type has to have a number of value bits greater than or equal to the number of value bits of its corresponding signed type.
Now we can rephrase part of the above as:
  • unsigned char may be promoted to either int or unsigned int:
    • If int can represent all unsigned char values (because the number of value bits of int >= number of value bits of unsigned char), unsigned char is converted to int.
    • Otherwise (because the number of value bits of int < number of value bits of unsigned char), unsigned char is converted to unsigned.
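Borderline example: a system where unsigned short has 1 padding bit and 31 value bits, and int has 1 sign bit and 31 value bits, would fall into the first condition for unsigned short even though sizeof(unsigned short) == sizeof(int), and so would be an exception to the previous rule of thumb using sizeof(). (unsigned char itself cannot have padding bits, see ISO C99 §6.2.6.2 ¶1, so the example needs a wider type.)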
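
The number of value bits can be derived from the limits in <limits.h>, which gives a way to check the promotion rule without relying on sizeof(). This is a sketch; value_bits and uchar_promotes_to_int are names made up for the example:

```c
#include <limits.h>

/* Number of value bits of an integer type, given its maximum value
 * (which is always of the form 2^n - 1). */
int value_bits(unsigned long long max)
{
        int bits = 0;

        while (max != 0) {
                bits++;
                max >>= 1;
        }
        return bits;
}

/* Returns 1 if unsigned char promotes to int, 0 if it promotes to
 * unsigned int. */
int uchar_promotes_to_int(void)
{
        return value_bits(UCHAR_MAX) <= value_bits(INT_MAX);
}
```

On a typical system with 8-bit chars and 32-bit ints this compares 8 against 31 value bits, so unsigned char promotes to int.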

Saturday, June 25, 2011

Character by Character Input on *nix

A frequently asked question is how to read a single key on *nix. Many people expect read(2) of a single byte to return immediately, but by default it doesn't.

On Unix you don't deal with keyboards, you deal with a terminal. A terminal can be the physical console, a terminal (or terminal emulator) on a serial port, an xterm, ...

By default, input lines are not made available to programs until the terminal (or terminal emulator) sees a line delimiter.

On POSIX, terminals are controlled by the termios(3) functions. Among them are tcgetattr(3) and tcsetattr(3), which read and modify the terminal attributes: input modes, output modes, control modes and local modes.

One of the local modes is canonical mode (enabled by default), in which input is made available line by line, and certain line editing characters are enabled.

So, to read input character by character, you need to do something like the following:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

#include <unistd.h>
#include <poll.h>
#include <signal.h>
#include <termios.h>
#include <sys/ioctl.h>

static volatile sig_atomic_t end = 0;

static void sighandler(int signo)
{
        end = 1;
}

int main()
{
        struct termios oldtio, curtio;
        struct sigaction sa;

        /* Save stdin terminal attributes */
        if (tcgetattr(0, &oldtio) < 0) {
                perror("tcgetattr");
                exit(1);
        }

        /* Make sure we exit cleanly */
        memset(&sa, 0, sizeof(struct sigaction));
        sa.sa_handler = sighandler;
        if (sigaction(SIGINT, &sa, NULL) < 0) {
                perror("sigaction");
                exit(1);
        }
        if (sigaction(SIGQUIT, &sa, NULL) < 0) {
                perror("sigaction");
                exit(1);
        }
        if (sigaction(SIGTERM, &sa, NULL) < 0) {
                perror("sigaction");
                exit(1);
        }

        /* This is needed to be able to tcsetattr() from a background
         * process group without being stopped by SIGTTOU;
         * see tcsetattr() on POSIX
         */
        memset(&sa, 0, sizeof(struct sigaction));
        sa.sa_handler = SIG_IGN;
        if (sigaction(SIGTTOU, &sa, NULL) < 0) {
                perror("sigaction");
                exit(1);
        }

        /* Set non-canonical no-echo for stdin */
        if (tcgetattr(0, &curtio) < 0) {
                perror("tcgetattr");
                exit(1);
        }
        curtio.c_lflag &= ~(ICANON | ECHO);
        /* This could be interrupted by a signal if it used
         * TCSADRAIN or TCSAFLUSH, but it wouldn't matter, since
         * we would have not changed terminal attributes yet
         */
        if (tcsetattr(0, TCSANOW, &curtio) < 0) {
                perror("tcsetattr");
                exit(1);
        }
        if (tcgetattr(0, &curtio) < 0) {
                perror("tcgetattr");
                exit(1);
        }
        if (curtio.c_lflag & (ICANON | ECHO)) {
                fprintf(stderr, "couldn't set non-canonical no-echo mode\n");
                exit(1);
        }

        /* main loop */
        while (!end) {
                struct pollfd pfds[1];
                int ret;
                char c;

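                /* See if there is data available. A timeout of 0 never
                 * blocks, so this loop busy-waits; use a positive
                 * timeout if there is nothing else to do meanwhile
                 */
                pfds[0].fd = 0;
                pfds[0].events = POLLIN;
                ret = poll(pfds, 1, 0);
                if (ret < 0 && errno != EINTR) {
                        perror("poll");
                        exit(1);
                }

                /* Consume data */
                if (ret > 0) {
                        ssize_t n;

                        printf("Data available\n");
                        n = read(0, &c, 1);
                        if (n < 0 && errno != EINTR) {
                                perror("read");
                                exit(1);
                        }
                        /* End of file (e.g. terminal hung up): stop
                         * instead of spinning forever
                         */
                        if (n == 0)
                                break;
                }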
        }

        /* restore terminal attributes */
        /* This could be interrupted by a signal if it used TCSADRAIN
         * or TCSAFLUSH
         */
        if (tcsetattr(0, TCSANOW, &oldtio) < 0) {
                perror("tcsetattr");
                exit(1);
        }

        return 0;
}
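
One detail the program above leaves to the inherited settings: in non-canonical mode, c_cc[VMIN] and c_cc[VTIME] decide when read(2) returns. POSIX allows these two slots to share storage with VEOF and VEOL, so it seems safer to set them explicitly. A sketch of a helper that does so (make_noncanonical is a name made up for this example):

```c
#include <termios.h>

/* Hypothetical helper: edits a termios structure so that, once applied
 * with tcsetattr(), read() returns as soon as one byte is available. */
void make_noncanonical(struct termios *tio)
{
        /* No line buffering, no echo */
        tio->c_lflag &= ~(tcflag_t)(ICANON | ECHO);

        /* Return after at least 1 byte, with no inter-byte timer */
        tio->c_cc[VMIN] = 1;
        tio->c_cc[VTIME] = 0;
}
```

In the program above, this would replace the line that clears ICANON and ECHO in curtio, right before the call to tcsetattr().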