The Hackerspace

2. Basic Data Types

By Joker. December 15th, 2019. 3:59 PM

Check the guide's index here.

Objectives

Data types in C - char, int, float and double
Variables - declaration, rules for variable naming
Value assignment - simple and chained
Integers, Reals and Characters - Personal and common characteristics
Numeric Operators - +, -, *, / and %
Variable and expression reading/writing - printf, scanf and getchar functions
Reading and writing formats - %c, %d, etc.
Characters vs. Integers
Casting - Promoting expressions/variables into different data types
Some common errors

Intro

Every time we open our fridge, we see an enormous variety of containers for all kinds of products: solid, liquid, regular, irregular, etc.

Each of those containers was designed and molded to store a well-defined type of good or product.

And so, we have cups and bottles for liquids, shelves with holes of the right diameter for eggs, and even a big set of plastic containers with similar characteristics and/or shapes. For example, when a jar is produced, its purpose is not to hold water or wine specifically, but to store liquids in general.

In the same way, a square container 20 cm wide can store (square) slices of ham, (round) slices of bologna or even the last slice of Mimi's cake, which, given her young age, ends up with a poorly defined shape, not really because of its own shape or consistency, but because of the beating it took from the party guests, who cut and reshaped it with no real geometric or aesthetic sense.

You can easily verify that storing eggs is not the same as storing water or any other liquid. In the same way, how you count eggs (1, 2, 3, 12, ...) is not how you measure quantities of liquid (0.2, 1.5, ...). It makes no sense to speak of 0.32 eggs or two dozen of water.

Now, no one really needs to take a course or read a book on which kinds of containers are best suited to each product in the fridge. But this is the theory you're learning first: in C programming, when you want to store data, you have to choose the best container to store it in.

The different kinds of containers you use to store products in your fridge correspond, in C, to its Basic Data Types. There are only four of them - char, int, float and double - and they will now be presented in detail.

Variables

Every time we want to store a value that, for any reason, is not fixed in advance, we do that using variables.

Note: A variable is nothing more than a name you give to a certain position in memory to contain a value of a certain type.

As its name indicates, the value contained inside a variable can vary during the length of a program's runtime.

A variable must always be defined before being used. A variable's definition tells the compiler which data type is associated with the name we give that variable.

Defining variables is done using the following syntax:

type var1 [, var2, ..., varN];

Examples:

int i;  /* i is a variable of the integer type */ 
char ch1, new_char;  /* ch1 and new_char are vars of the char type */
float pi, ray, perimeter;
double total, k123;

Note: Variable declaration must always be done before its usage and before any instruction using it.

main()
{
  Variable Declaration; <--
  
  Instruction_1;
  Instruction_2;
}

Variables are always stored in memory, and are a simple way to reference memory positions. The type that is associated to it indicates the no. of Bytes that will be used to store a value in that variable.

There is also another type - the pointer - that can also be considered a basic type. No need to worry though, there's a whole chapter dedicated to pointers coming up soon.

A variable can be given its first value (initialized) through an assignment operation.

A value can only be assigned to a variable. When an assignment is done, the value previously stored in the variable is discarded and the newly assigned value takes its place.

It's this capacity to store different values that gives these objects the name variables, i.e., their content can vary during a program's runtime.

An assignment is done using the following syntax:

variable = expression;

Value assignment in C is done with the = character. The variable being altered is ALWAYS placed on the left side of the assignment, and the value being assigned on the right side.

Example: To place the value -17 in the variable num, you write:

int num;      /* Declaring the num variable */
num = -17;    /* num now stores the value -17 */

Note: A variable can be automatically initialized when you declare it.

The two previous lines could have been grouped into a single line.

int num = -17;      /* num is declared as variable of int type and automatically */
                    /* initialized with the value -17 */

int n1=3, n2=5;     /* n1 and n2 are declared and take the values 3 and 5 */
                    /* respectively */

int a = 10, b, c = -123, d;
                    /* a and c are automatically initialized with the values 10
                     * and -123. b and d hold indeterminate values ("garbage")
                     * because they were not initialized.
                     */

Example: Place, in the variable val, the value stored in the variable num.

val = num;     /* val receives the value contained in num */

Note: In C, you can assign the same value to multiple variables.

Example: Place the value 5 in the previously declared variables a, b, c and d.

a = 5;
b = 5;
c = 5;
d = 5;

or, you can even do

a = b = c = d = 5;

This is only possible in C because, every time you do an assignment, the assigned value is returned (like in a function) and can be used in other expressions or assigned to other variables.

Let's see how the previous example works in detail.

Let's suppose the 4 variables were initialized with distinct values:

a = 1;     /* the variable a receives the value 1 */
b = 2;
c = 3;
d = 4;

What would be the value of the variables a, b, c and d if the next line were to be executed?

a = b = c = d = 5;

Someone who looks at this instruction for the first time normally gives one of two answers.

1. All of the variables take the value 5.
2. Variable a takes the value of b (2), b takes the value of c (3), c takes the value of d (4) and only d is assigned the value 5.

At first glance, Answer no. 2 seems the most consistent with what we know about programming, since assignments would be done from left to right, following the normal direction of instruction execution.

However, Answer no. 1 is actually the correct one. Now, why is that?

The reason is very simple and has to do with the characteristics of the C language. When you write multiple chained assignments, they are done not from left to right, but from right to left.

a = b = c = d = 5;
<-----------------

Let's then verify how this instruction is executed.

When you do assignments, they're done from right to left.
The first one being executed is d = 5.
As was mentioned before, the value assigned to d is returned as a result of the assignment.
a = b = c = (d = 5);   /* d = 5 returns 5 */
The value returned (5) is assigned to c.
This assignment returns its assigned value (5), and this value is assigned to b.
This process is repeated for variable a. This assignment also returns its assigned value, but since nothing else uses the returned value, it is simply discarded.

Integers - int

Variables declared with the integer type are used to store values that belong to the set of integers: positive and negative whole numbers and zero, with no fractional part. Ex.: 2, -345, +115, 0.

As was mentioned before, the definition of the variable num, of the integer type, is done with the following instruction:

int num;      /* What value does it take ??? */

Operations with integers

Since we're talking about integer numbers, a set of operations can be executed on them, and the result is always an integer value.

Operation	Description	Example	Result
+	Addition	`21 + 4`	`25`
-	Subtraction	`21 - 4`	`17`
*	Multiplication	`21 * 4`	`84`
/	Integer Division	`21 / 4`	`5`
%	Remainder of the Integer Division (Modulo)	`21 % 4`	`1`

In relation to addition, subtraction and multiplication, there's not much to say; the same can't be said about the / and % operators, though.

Note: Any operation between integers returns an integer value.

This way, the division between 21 and 4 would not result in 5.25, as one would think, since the result of an operation between two integers (21 and 4) always has to result in an integer.

The quotient of the division is obtained by the division operator (/) and the remainder of the division is obtained by the Modulo operator (%). We will now learn how you can write integers on screen.

Watch the next program carefully:

prog0201.c

1: #include <stdio.h>
2:
3: int main(void){
4:   int num=123;
5:
6:   printf("The value of num = %d and the next value is = %d\n", num, num+1);
7: }

In line 4: of the program prog0201.c, we declare the integer type variable num and initialize it automatically with the value 123. As was mentioned before, variables always have to be declared before they're used and before any instructions (like the printf we used).

Next, we invoke the printf function with a set of weird parameters. Let's try, then, to understand what is passed to the printf function.

We want to write the following string on-screen:

  The value of num = 123 and the next value is = 124

This is, then, the string that should be passed to printf:

  "The value of num = 123 and the next value is = 124\n"

However, the value inside of num is stored in a variable, and we can't place the variable num inside of printf's display string, since printf would write the string num instead of the value stored in that variable.

What we want to write is actually:

  The value of num = <integer> and the next value is = <integer>

in which <integer> represents the integer value that's stored in a variable, constant or is returned by some expression.

Now, inside of a printf, every time we want to write an integer value, we should put a write format in its place (remember that printf can print formatted text, hence its name); in that exact spot, the format stands for the integer we want to write.

Note: The write format for integers in printf is %d.

Let's place the %d symbol in a place where we want to write our integers:

  The value of num = %d and the next value is = %d\n

All we need to do now is tell printf what values to place in the spots marked by a %d.

  printf("The value of num = %d and the next value is = %d\n", num, num+1)

To do so, we write the string we want to print out and place in order the variables or values that will be replaced by each %d, split by commas.

In this case, the first %d will be replaced by the value stored in the variable num, and the next %d will be replaced by the result of the expression num + 1.

As such, we obtain the desired output:

$ ./prog0201
The value of num = 123 and the next value is = 124
$

In the same way that we have the function printf for outputting values, we also have a corresponding function for inputting values - the scanf function.

prog0202.c

1: #include <stdio.h>
2: int main(void){
3:   int num;
4:
5:   printf("Input a number: ");
6:   scanf("%d", &num);
7:   printf("The inputted number is %d\n", num);
8: }

The scanf function (formatted scanning) works in a similar way to the printf function. Since it was implemented for reading values, the initial string must only contain the formats of the values we want to read.

After specifying the reading formats in the string, you must place all of the corresponding variables in the order that the formats are presented, all preceded by an & (except if they're strings).

Note: To read any variable of the int, char, float and double types using the scanf function, you'll need to precede each variable with an & (ampersand). If this is not done, the program's execution may have unexpected results. The reason for this is going to be thoroughly explained in the pointers chapter.

In the case of the previous program, we want to read the value of a variable. To do that, we use the function

  scanf()

This function's first parameter is a string with the needed read formats. Since we want to read only one variable, it will contain only one read format. Since the variable we want to read is an integer, the reading format will be %d.

  scanf("%d")

Next, we have to indicate which variable will receive the integer value we want to read. This variable, since it's an integer, needs to be written as a parameter, preceded by an &.

  scanf("%d", &num);

And just like that, we obtained line 6:, which allows us to read an integer and store it in a variable.

The integer, after being read, is stored in the variable num and its value is then written on screen by the printf function.

Note: The string sent to the scanf function should not contain any characters other than the format specifiers. One common error is to end the string with \n: any whitespace at the end of the format makes scanf keep reading until it finds a non-whitespace character, so it does not return as soon as the values are entered.

Quick question (and don't cheat): what does the next program do?

prog0203.c

#include <stdio.h>

int main(void){
  int n1, n2;

  printf("Input two numbers: ");
  scanf("%d%d", &n1, &n2);   /* read both integers */
  printf("The result of %d + %d = %d\n", n1, n2, n1+n2);
  return 0;
}

$ ./prog0203
Input two numbers: 12 45
The result of 12 + 45 = 57
$

In this case we declare two integer variables (n1 and n2). We ask for two values to be inputted. Two integers are read and stored in n1 and n2.

Next we present the value of the sum of the two read integers:

printf("The result of %d + %d = %d\n", n1, n2, n1+n2)

Analysing the printf, it will write "The result of"
Next, it will replace the first %d with the value of n1
It keeps writing the characters of the string - " + "
Next, it will replace the second %d with the value of n2
It keeps writing the characters of the string - " = "
Next, it will replace the third %d with the returning value of the sum of n1+n2
Finally, it writes the final character of the string - "\n" (changes lines)

Integers and Variations

As was previously mentioned, the size of an integer in Bytes varies from one architecture to the other, and the most common sizes are 2 or 4 Bytes.

It's important to know the size of an integer when you develop an application; otherwise, you run the risk of trying to store a value in an integer variable with an insufficient number of Bytes.

To know the size of an integer (or any other data type or variable), C provides an operator named sizeof, whose syntax is similar to the one used to invoke functions.

The sizeof operator's syntax is:

sizeof <expression> or sizeof ( <type> )

Example: Write a program that indicates how many Bytes an integer occupies in memory.

prog0204.c

#include <stdio.h>

int main(void){
  /* sizeof yields an unsigned size; the (int) cast makes it match %d */
  printf("The size (in Bytes) that an integer occupies in memory is %d\n", (int) sizeof(int));
  return 0;
}

As a result of this program's execution on a microcomputer, you get:

> prog0204
The size (in Bytes) that an integer occupies in memory is 2
>

In the case that the program is executed on a Unix machine (for example), the result would probably be:

$ ./prog0204
The size (in Bytes) that an integer occupies in memory is 4
$

If you want to know the size in Bytes of ALL Basic Data Types in C, you would only need to alter the program in order to check also the size of the char, float and double types.

Example: Write a program that indicates the no. of Bytes occupied by each of the Basic Data Types.

prog0205.c

#include <stdio.h>

int main(void){
  /* the (int) cast makes each sizeof result match the %d format */
  printf("The size (in Bytes) of a char = %d\n", (int) sizeof(char));
  printf("The size (in Bytes) of an int = %d\n", (int) sizeof(int));
  printf("The size (in Bytes) of a float = %d\n", (int) sizeof(float));
  printf("The size (in Bytes) of a double = %d\n", (int) sizeof(double));
  return 0;
}

$ ./prog0205
The size (in Bytes) of a char = 1
The size (in Bytes) of an int = 2
The size (in Bytes) of a float = 4
The size (in Bytes) of a double = 8
$

The fact that an integer's size can vary is somewhat worrying, because the limits of variables which store integers can vary drastically, strongly reducing the portability of programs between different machines. Let's note the difference in the values that a variable can contain.

No. of Bytes	Smallest Value	Largest Value
2	-32 768	32 767
4	-2 147 483 648	2 147 483 647

How can we, then, guarantee that a program written by us always uses 2 or 4 Bytes of memory to store an integer, if an integer's size varies from one machine to the other?

Well, when declaring an integer, we can use 4 distinct prefixes to better define the characteristics of the variable.

short integer - 2 Bytes
long integer - 4 Bytes (or more on some 64-bit systems)
signed integer - contains negative and positive numbers
unsigned integer - only contains zero and positive numbers (same size, but the largest value roughly doubles)

short and long

To make the integer n use 2 Bytes of memory on practically every architecture, we should declare the variable as:

 short int n; /* or short n; */

To make the integer n use at least 4 Bytes of memory, independently of the architecture used, we should declare the variable as:

 long int n; /* or long n; */

Strictly speaking, the language only promises minimum sizes: a short has at least 2 Bytes, a long has at least 4 Bytes (8 on many 64-bit Unix systems), and an int is never smaller than a short nor larger than a long. This way:

short int	int	long int
2	2	4
2	4	4

(size in Bytes)

Note: The read and write formats of short and long integer variables in the scanf and printf functions should be preceded by the h (short) and l (long) prefixes.

Example: Write a program that asks the user to input an age, amount to deposit and the account no. in which the deposit is going to be made, declaring the variables as short, int and long.

prog0206.c

#include <stdio.h>

int main(void){
  short int age; /* or short age; */
  int amount;
  long int account_n; /* or long account_n; */

  printf("What is your age: "); scanf("%hd", &age);
  printf("How much to be deposited: "); scanf("%d", &amount);
  printf("What is the account number: "); scanf("%ld", &account_n);

  printf("A %hd year old person deposited %d Money in the account with the number %ld\n", age, amount, account_n);
  return 0;
}

$ ./prog0206
What is your age: 19
How much to be deposited: 1500
What is the account number: 123456789
A 19 year old person deposited 1500 Money in the account with the number 123456789
$

signed and unsigned

By default, an integer type variable admits positive and negative integer values.

For example, if an integer is stored in 2 Bytes of memory, its values can range from -32768 to 32767.

In case you only want the variable to contain positive values, it can be declared with the unsigned prefix.

Example:

 unsigned int Age; /* or unsigned Age; */
                  /* The age of an individual cannot be negative */

Note: The signed prefix is not necessary before an integer, since, by default, all integers are signed when created.

Supposing that the size of an integer is 2 Bytes, here is the list of limits within which an integer variable can vary, depending on the prefixes used.

Type of Variable	No. of Bytes	Smallest Value	Largest Value
`int`	2	-32 768	32 767
`short int`	2	-32 768	32 767
`long int`	4	-2 147 483 648	2 147 483 647
`unsigned int`	2	0	65 535
`unsigned short int`	2	0	65 535
`unsigned long int`	4	0	4 294 967 295

Reals - float and double

Variables declared as float or double are used to store numeric values with a fractional part. They're also frequently called real numbers or floating point numbers (Ex.: 3.14, 0.0000024514, 1.0).

The difference between a float type variable and a double type variable is the number of Bytes that each uses to store the value. A float's size is normally 4 Bytes, while a double takes 8 Bytes. These two types are usually described as storing numbers with single precision (float) or double precision (double).

A floating point number is written with an integer part and a decimal part, separated by a dot (and not a comma, as is usual in some math classes).

  float Pi = 3.1415;
  double error = 0.000001;
  float total = 0.0;

Example: Write a program that calculates the perimeter and area of a circle.

prog0207.c

#include <stdio.h>

int main(void){
  float ray, perimeter;   /* ray holds the circle's radius */
  double Pi = 3.1415927, area;

  printf("Input the radius of the circle: ");
  scanf("%f", &ray);
  area = Pi * ray * ray;
  perimeter = 2 * Pi * ray;

  printf("Area = %f a.u.\nPerimeter = %f m.u.\n", area, perimeter);
  return 0;
}

$ ./prog0207
Input the radius of the circle: 1500
Area = 7068583.575000 a.u.
Perimeter = 9424.778320 m.u.
$

In the previous example, two single precision variables were declared (ray and perimeter), along with two double precision variables (Pi and area). The variable Pi actually behaves more like a constant during the program's execution. Later, we will learn how to work with constants.

It is convenient to remember that, for a circle of radius r, Area = π*r^2 and Perimeter = 2*π*r.

There is no operator that allows you to calculate the square of a number. It must be done by multiplying that number by itself, or by using a library function (specifically, pow from the math.h header file, which computes the power of a number).

After reading the value of the radius with the %f format, we calculate the area and the perimeter. Next, we display the obtained results.

The assignment, reading and writing of real numbers can also be done in scientific notation, specifying a mantissa and an exponent. In this case, the number is written as the mantissa, the letter e (or E) and the power of 10, for example 123.46E78.

The number stored in this example is 123.46 * 10^78.

Example: Write a program that executes the conversion from tons to kilos and grams, writing the result in traditional (aaaa.bbb) and scientific notation (aaa Ebbb).

prog0208.c

#include <stdio.h>

int main(void){
  float kilos = 1.0E3;   /* one ton is 1,000 kilos */
  double grams = 1.0e6;  /* one ton is 1,000,000 grams */
  float n_tons;

  printf("How many tons: "); scanf("%f", &n_tons);
  printf("No. of Kilos = %f = %e\n", n_tons * kilos, n_tons * kilos);
  printf("No. of Grams = %f = %E\n", n_tons * grams, n_tons * grams);
  return 0;
}

$ ./prog0208
How many tons: 134.567
No. of Kilos = 134567.001343 = 1.345670e+05
No. of Grams = 134567001.342773 = 1.345670E+08
$

In the previous program, we declared two real variables kilos and grams to contain the values of one thousand and 1 million, which can be written in traditional format (1000.0 and 1000000.0) or in scientific notation (1.0E3 and 1.0E6).
The user is queried to input a determined number of tons.
Next, we write the number of kilos and the number of grams corresponding to the no. of tons introduced (in traditional format and in scientific format).

Note: Values stored as floating point numbers can carry a small error, resulting from rounding the number or from the internal format of its representation.

In the previous example, two different writing formats were used for scientific notation - %e and %E. The difference is in the way the values are then displayed (with a lowercase e - 1.2e+5; or with an uppercase E - 1.2E+5).

Operations with Reals

Any operation in which one of the operands is a real number will return a real type result. What distinguishes a real from an integer in the source code is the presence of the dot, which separates the integer and fractional parts.

10	Integer 10
-10	Integer -10
10.	Real 10.0 (because of the dot)
10.0	Real 10.0 (identical to the previous one)
10.25	Real 10.25

The full set of available operations for floating point numbers is the same for integer numbers (except for the % operator - Modulo).

Operation	Description	Example	Result
+	Addition	`21.3 + 4.1`	`25.4`
-	Subtraction	`21.7 - 4.8`	`16.9`
*	Multiplication	`21.2 * 4.7`	`99.64`
/	Division	`21.0 / 4.0`	`5.25`
%	Makes no sense to apply to reals	`n.a.`	`n.a.`

Note: Any operation in which one of the operands is a real number produces a real number. If any of the operands is, for example, an integer, and the other is a real number, the integer is promoted to real (4 -> 4.0) so that the operation is executed between two real numbers.

Examples:

21   / 4   -> 5    /* Integer Division */
21.0 / 4   -> 5.25 /* Since 21.0 is a real number, 4 is promoted to 4.0 */
21   / 4.  -> 5.25 /* Since 4. is a real number, 21 is promoted to 21.0 */
21.0 / 4.0 -> 5.25 /* Real Division */

This way, the division of 21 by 4 will not result in 5.25, as one would be led to believe, since the result of an operation between two integers (21 and 4) would always return an integer. For the result to have decimal places, at least one of the operands has to be a real number.

The remainder of the division (% - Modulo) cannot be applied to real numbers. Since the quotient of a real division keeps the decimal places needed to represent the result as accurately as possible, nothing is left over: save for the rounding errors inherent to these operations, the remainder would always be 0 (zero), so the operator would serve no purpose.

If you try to apply the % operator in which one of the operands is a real number, the compiler will present the respective error and suspend the compiling process.

Characters - char

A variable of the char type allows you to store ONLY ONE CHARACTER.

The reason why I emphasize the expression ONLY ONE CHARACTER is that one of the most common C programming mistakes is to think that a single variable of the char type can store strings or sets of characters.

I'll say it again: a variable of the char type can contain only one character at a time.

If, in relation to integers and reals, there is any doubt about the size allocated to represent them, in the case of the char, independently of the used architecture, it's always stored in only one Byte.

Note: A char is always stored in 1 Byte.

This way, the number of possible characters to represent is 256, because it's the number of possible combinations to form in one single Byte (0-255).

00000000 - Every Bit at 0 (value 0).

11111111 - Every Bit at 1 (value 255).
