Section: Bit Manipulations - Bit Count#
Adapted from: gjbex/Fortran-MOOC
This program demonstrates bit manipulations using Fortran.
Summary Code Explanation Explanation by ChatGPT 4o#
The given programs and module (program bitcount
and module bitmanip_mod
) provide a framework to evaluate and compare multiple methods of counting set bits (1’s) in integers. Here’s a detailed breakdown of their functionality:
Program bitcount
#
This is the main driver program that performs the following:
Sanity Check:
Iterates over a range of integers from
-16
to16
.For each integer, it calculates the number of set bits using four different methods:
naive_count_bits
early_stopping_count_bits
lookup_table_count_bits
kernighan_count_bits
Prints the results of these methods alongside the binary representation of the number using the
bit_repr
function.
Benchmarking:
Reads the number of values (
nr_vals
) from the command line using the subroutineget_arguments
.Allocates an array
vals
to store random integers for testing.Fills
vals
with random integers.Times each of the four bit-counting implementations (
naive
,early stopping
,lookup table
, andKernighan
) over the generated dataset.Prints the timing results for comparison.
Subroutine
get_arguments
:Validates and extracts the number of integers (
nr_vals
) to process from the command-line arguments.Provides error handling if the input is invalid.
Module bitmanip_mod
#
This module contains the bit-counting algorithms, bit representation logic, and utilities for working with a lookup table.
Key Components:#
Lookup Table Management:
A 256-element lookup table (
lookup_table
) is used to optimize bit-counting for bytes (8-bit chunks).The
ensure_lookup_table_initialized
subroutine ensures the table is loaded from a CSV file before use.initialize_lookup_table
reads values fromlookup_table.dat
using a CSV reader (csv_module
).
Utility Functions:
bit_repr
: Converts an integer into its binary string representation.remove_trailing_comma_ampersand
: Cleans up lines from the CSV file by removing trailing, &
for compatibility.
Bit-Counting Methods:
naive_count_bits
:Counts bits by iteratively checking the least significant bit using
and(n, 1)
and shifting the number (ishft
).Iterates a fixed number of times (32 for 32-bit integers).
early_stopping_count_bits
:Similar to
naive_count_bits
, but exits early when all remaining bits are 0.
lookup_table_count_bits
:Divides the integer into four 8-bit chunks.
Uses the lookup table to sum the number of set bits in each chunk.
This is faster because it avoids iterative bit operations.
kernighan_count_bits
:Uses Brian Kernighan’s algorithm, which repeatedly clears the least significant set bit (
n = and(n, n - 1)
) until the number becomes 0.The number of iterations equals the number of set bits.
CSV File Parsing:
parse_csv_file
: Reads a CSV file, preprocesses it to remove trailing artifacts, and populates the lookup table.Uses the
csv_module
library for handling CSV data.
Workflow#
Sanity Check:
Validates that all methods produce the same results for small integers (-16 to 16).
Benchmarking:
Tests the efficiency of each method over a large dataset of random integers, highlighting performance differences.
Bit Manipulation Techniques:
Each method showcases different trade-offs between simplicity, performance, and implementation complexity:
Naive: Simple but slow.
Early Stopping: Optimized for sparse integers with fewer set bits.
Lookup Table: Fast for large datasets, relies on preprocessing and memory.
Kernighan: Efficient in terms of operations for sparse integers.
Results:
Binary representations and bit counts are printed for verification.
Execution times are reported for performance analysis.
Key Features and Highlights#
Modular Design: The
bitmanip_mod
encapsulates reusable logic, making the implementation extendable and maintainable.Error Handling: Comprehensive checks for memory allocation, file reading, and input validity.
Performance Comparison: Benchmarks allow empirical evaluation of different approaches, useful for understanding their practical efficiency.
Lookup Table Preprocessing: Demonstrates efficient preprocessing for rapid bit-counting.
Applications#
This program can be used to:
Analyze and optimize bitwise operations in software.
Serve as a teaching tool for bit manipulation techniques.
Benchmark and compare algorithmic performance for counting set bits.
Detailed Analysis by ChatGPT 4o1: bitcount.f90
#
Below is an extremely detailed, section-by-section analysis of bitcount.f90
, which serves as a driver program to demonstrate and benchmark various bit-counting methods defined in the companion module bitmanip_mod
. We’ll examine the program’s flow, purpose of each block of code, and some considerations for performance and design.
1. Program Declaration and Imports#
program bitcount
use, intrinsic :: iso_fortran_env, only : error_unit
use :: bitmanip_mod
implicit none
program bitcount
: Declares the main program unit namedbitcount
.use, intrinsic :: iso_fortran_env, only : error_unit
:Imports
error_unit
, a predefined I/O unit for error reporting from the Fortran intrinsic moduleiso_fortran_env
.error_unit
is commonly used in lieu of a hard-coded unit number (e.g., 0 or 2) for consistent error output.
use :: bitmanip_mod
:Brings in public symbols from the
bitmanip_mod
module. This includes the following procedures:naive_count_bits
early_stopping_count_bits
lookup_table_count_bits
kernighan_count_bits
bit_repr
The
bitmanip_mod
also automatically handles initialization of its lookup table for the lookup-table-based bit-counting method.
implicit none
:Prohibits the use of implicitly typed variables, requiring all variables to be declared explicitly.
Improves safety by preventing accidental usage of undeclared variables.
2. Variable Declarations#
integer :: nr_vals
integer(kind=I4), dimension(:), allocatable :: vals
integer(kind=I4) :: result
integer :: i, istat
integer :: naive_count_bits_result, early_stopping_count_bits_result, &
lookup_table_count_bits_result, kernighan_count_bits_result
real :: r, start_time, end_time
nr_vals
: Holds the number of values (integers) to be processed, as read from the command-line argument.vals
(integer(kind=I4), dimension(:), allocatable
):An allocatable array that will store
nr_vals
random integers (kind=I4
is a 32-bit integer type).
result
(integer(kind=I4)
):Will store the cumulative result (via XOR) during benchmarks of the bit-counting routines.
i, istat
(integer
):i
is a loop counter.istat
captures the status code returned by array allocation.
naive_count_bits_result, early_stopping_count_bits_result,
lookup_table_count_bits_result, kernighan_count_bits_result
:Temporary integers to hold the results from each of the four bit-counting methods for a given integer in the sanity check loop.
r, start_time, end_time
(real
):r
: used to store random numbers in the range [0,1).start_time, end_time
: used withcpu_time
intrinsic to measure benchmarking durations.
3. Sanity Check Loop#
print '(A)', 'sanity check: naive, early stopping, lookup table Kernighan'
do i = -16, 16
naive_count_bits_result = naive_count_bits(i)
early_stopping_count_bits_result = early_stopping_count_bits(i)
lookup_table_count_bits_result = lookup_table_count_bits(i)
kernighan_count_bits_result = kernighan_count_bits(i)
print '(I14, 3A, 4I3)', i, ': ', bit_repr(i), ': ', &
naive_count_bits_result, early_stopping_count_bits_result, &
lookup_table_count_bits_result, kernighan_count_bits_result
end do
Purpose:
Verifies that all four bit-counting functions (
naive_count_bits
,early_stopping_count_bits
,lookup_table_count_bits
,kernighan_count_bits
) produce the same result for a range of integers from-16
to16
.
Details:
Prints a header line:
"sanity check: naive, early stopping, lookup table Kernighan"
.Iterates
i
from-16
up to16
.For each integer
i
, obtains its bit counts via each method and also retrieves its binary representation viabit_repr(i)
.Prints the integer, its binary representation, and the four counts.
Note the format string
'(I14, 3A, 4I3)'
:I14
: prints the integeri
in a field of 14 characters (right-aligned).3A
: prints three consecutive strings (in this case,": "
,bit_repr(i)
, and": "
).4I3
: prints the four integer results in fields of width 3 each.
Significance:
This “sanity check” ensures the correctness of the implementations on both positive and negative values (showing how negative values are handled, presumably via two’s complement representation).
4. Creating Values for Benchmarking#
4.1 Reading Command-Line Argument#
call get_arguments(nr_vals)
Calls a local subroutine
get_arguments
(defined in thecontains
section at the end of the program) which:Ensures exactly one command-line argument is present.
Reads it into
nr_vals
.Exits with an error if invalid input is provided.
4.2 Allocating the Array#
allocate (vals(nr_vals), stat=istat)
if (istat /= 0) then
write (unit=error_unit, fmt='(A, I0, A)') 'error: can not allocate ', &
nr_vals, ' elements'
stop 3
end if
Tries to allocate the array
vals
of lengthnr_vals
.Checks if
istat
(the status code from the allocation) is not zero, meaning an allocation failure:If failed, writes an error message to
error_unit
with'error: can not allocate X elements'
.Terminates the program with exit code 3.
4.3 Filling vals
with Random Integers#
do i = 1, size(vals)
call random_number(r)
vals(i) = int(r*real(huge(0_I4)), kind=I4)
end do
Generates a random real number
r
in[0,1)
viacall random_number(r)
.Scales it to the range
[0, huge(0_I4)]
—the largest representable 32-bit integer—by multiplying byr
.Converts to a 32-bit integer via
int(..., kind=I4)
.Stores the result in
vals(i)
.Goal: Populate a large array of random 32-bit integers for performance testing.
5. Benchmarking Loops#
The program times each bit-counting method by:
Recording a start time.
Looping over all values in
vals
, applying the bit-counting method, and XOR-ing results intoresult
.Recording an end time.
Computing
end_time - start_time
to measure the method’s total execution time.
The XOR (result = xor(result, ...)
) is presumably done to prevent the compiler from optimizing away the function calls in a trivial scenario; storing or mixing results ensures each bit-counting call cannot be elided.
5.1 Naive Implementation#
call cpu_time(start_time)
do i = 1, size(vals)
result = xor(result, naive_count_bits(vals(i)))
end do
call cpu_time(end_time)
print '(A, I0, A, F15.6)', 'naive implementation ', size(vals), &
' iterations: ', end_time - start_time
cpu_time
:Built-in Fortran intrinsic that returns a process-based CPU time in seconds (implementation-dependent).
Captures
start_time
before the loop,end_time
after the loop.
Loop:
Iterates over the entire
vals
array.Calls
naive_count_bits(vals(i))
.XORs the returned bit count with the accumulating
result
.
Prints: The method name, the number of iterations (
size(vals)
), and the total execution time formatted inF15.6
(15 wide, 6 decimal places).
5.2 Early Stopping Implementation#
call cpu_time(start_time)
do i = 1, size(vals)
result = xor(result, early_stopping_count_bits(vals(i)))
end do
call cpu_time(end_time)
print '(A, I0, A, F15.6)', 'early stopping implementation ', size(vals), &
' iterations: ', end_time - start_time
Same structure as above, but uses
early_stopping_count_bits
.
5.3 Lookup Table Implementation#
call cpu_time(start_time)
do i = 1, size(vals)
result = xor(result, lookup_table_count_bits(vals(i)))
end do
call cpu_time(end_time)
print '(A, I0, A, F15.6)', 'lookup table implementation ', size(vals), &
' iterations: ', end_time - start_time
Same benchmarking pattern, but with
lookup_table_count_bits
.
5.4 Kernighan’s Implementation#
call cpu_time(start_time)
do i = 1, size(vals)
result = xor(result, kernighan_count_bits(vals(i)))
end do
call cpu_time(end_time)
print '(A, I0, A, F15.6)', 'Kernighan implementation ', size(vals), &
' iterations: ', end_time - start_time
Same pattern using
kernighan_count_bits
.
6. Contained Subroutine: get_arguments
#
At the end of the file, after the main executable sections, the program uses an internal subroutine to handle command-line arguments.
contains
subroutine get_arguments(nr_vals)
use, intrinsic :: iso_fortran_env, only : error_unit
implicit none
integer, intent(out) :: nr_vals
character(len=1024) :: buffer, msg
integer :: istat
if (command_argument_count() /= 1) then
write (unit=error_unit, fmt='(A)') 'error: expecting number of values'
write (unit=*, fmt='(A)') 'error: expecting number of values'
stop 1
end if
call get_command_argument(1, buffer)
read (buffer, fmt=*, iostat=istat, iomsg=msg) nr_vals
if (istat /= 0) then
write (unit=error_unit, fmt='(2A)') 'error: ', trim(msg)
stop 2
end if
end subroutine get_arguments
end program bitcount
contains
:A special block in Fortran programs or modules that holds internal procedures.
subroutine get_arguments(nr_vals)
:Purpose: Reads exactly one command-line argument (the expected integer
nr_vals
).Behavior:
Checks if
command_argument_count()
(the number of command-line arguments) is exactly 1; if not, writes an error toerror_unit
and stops with code1
.Extracts the argument string into
buffer
viaget_command_argument(1, buffer)
.Attempts to parse
buffer
as an integer intonr_vals
using an unformatted read.If
istat /= 0
, meaning a read error or type mismatch, prints an error with theiomsg
and stops with code2
.
Note:
This approach is typical in Fortran to parse arguments from the command line (when compiled for systems that support it, which is most modern Fortran compilers).
7. High-Level Flow Summary#
Putting it all together:
Initialization:
The program starts and immediately imports the bit manipulation module (
bitmanip_mod
).No explicit module initialization is called here; however, if the user calls a bit-counting method that requires the lookup table,
bitmanip_mod
takes care of loading it on first use.
Sanity Check:
The program iterates from -16 to 16, computing bit counts via each of the four methods.
Prints the integer, its binary form, and the counts. This confirms correctness and demonstrates consistency among methods for small positive/negative integers.
Command-Line Argument:
Reads
nr_vals
, ensuring the user provided one argument (the number of random values to test).
Allocate & Populate Data:
Dynamically allocates an array of 32-bit integers.
Fills it with random values spanning the possible integer range.
Benchmarking:
For each of the four methods:
Record start time via
cpu_time
.Loop over
vals
, compute bit count, XOR withresult
.Record end time.
Print elapsed time.
Exit:
After printing all four benchmarks, the program ends.
8. Notable Considerations#
Negative vs. Positive Values:
The sanity check covers negative values. The bit-counting routines will handle them correctly assuming two’s complement. The
bit_repr
function inbitmanip_mod
displays the internal binary representation, including sign bits for negative numbers.
Performance:
The four approaches (Naive, Early Stopping, Lookup Table, Kernighan) have different performance characteristics. The program’s timing output shows which approach is fastest for a given number of iterations.
result
XOR:The
result = xor(result, ...)
line ensures that the function call is not optimized away. If the loop simply callednaive_count_bits(vals(i))
without using the result, a compiler might skip or inline it, artificially boosting performance.The XOR ensures a dependency on the returned value, making the measurement more realistic.
Scalability:
The user can pass large
nr_vals
to create a substantial workload, thus measuring and comparing the methods over many iterations.
Error Handling:
The program uses
stop
statements with nonzero codes for various error conditions:1
: Wrong number of command-line arguments.2
: Failed to parse integer from the command line.3
: Failure in memory allocation.
Messages are written to
error_unit
to help diagnose problems.
9. Conclusion#
bitcount.f90
is a concise yet powerful driver program that:
Demonstrates correctness for multiple bit-counting algorithms (
naive
,early stopping
,lookup table
,Kernighan
).Benchmarks each method across a user-specified number of randomly generated 32-bit integers.
Leverages the companion module
bitmanip_mod
for all bitwise utilities and the lookup-table feature.
This design makes it easy to compare performance, study different approaches to counting bits, and serve as a template for further exploration of bitwise operations in Fortran.
Detailed Analysis by ChatGPT 4o1: bitmanip_mod
#
Below is a comprehensive, section-by-section analysis of the Fortran module bitmanip_mod
. This module contains bit-counting routines, a mechanism for loading a 256-element lookup table from a CSV file, and several utility routines for string manipulation and bitwise operations. It demonstrates careful management of initialization, file I/O, and the use of public/private scope to organize code effectively.
1. Module Declaration#
module bitmanip_mod
use, intrinsic :: iso_fortran_env, only : INT32, error_unit
use csv_module
implicit none
private
module bitmanip_mod
: Defines a Fortran module namedbitmanip_mod
.use, intrinsic :: iso_fortran_env, only : INT32, error_unit
:INT32
is an integer kind parameter guaranteeing a 32-bit integer (as per the ISO Fortran environment).error_unit
is a special I/O unit typically used for error reporting.
use csv_module
: Imports a third-party or external library providing CSV-related capabilities (presumably the “csv-fortran” library or similar).implicit none
: Disallows implicit typing of variables, ensuring safer and more explicit code.private
: Sets the default accessibility of module entities to private, making them inaccessible to other program units unless explicitly declaredpublic
.
2. Lookup Table and Flags#
! A 256-element lookup table stored in the module
integer, public, parameter :: I4 = INT32
integer(kind=I4), dimension(0:255), private :: lookup_table
logical, private, save :: initialized = .false.
integer, public, parameter :: I4 = INT32
Defines a module parameter
I4
that is publicly visible.I4
is set toINT32
, effectively binding the symbolI4
to the 32-bit integer kind. Other routines useI4
to ensure consistent integer typing.
integer(kind=I4), dimension(0:255), private :: lookup_table
Declares an array
lookup_table
of size 256 (indices 0 to 255
) to store precomputed bit counts or other associated values.Marked
private
, so it cannot be accessed directly from outside the module.
logical, private, save :: initialized = .false.
A logical flag indicating whether the lookup table has already been loaded from disk.
Marked
save
so that its state persists across multiple calls; otherwise, it might not retain its value.
3. Public Interfaces#
! Public procedures
public :: naive_count_bits, early_stopping_count_bits, bit_repr, &
lookup_table_count_bits, kernighan_count_bits
Declares which routines in the module are accessible externally:
naive_count_bits
early_stopping_count_bits
bit_repr
lookup_table_count_bits
kernighan_count_bits
Everything else in the module remains private
, reinforcing modular encapsulation.
4. Subroutines for Lookup Table Initialization#
4.1 ensure_lookup_table_initialized
#
subroutine ensure_lookup_table_initialized()
if (.not. initialized) then
call initialize_lookup_table()
end if
end subroutine ensure_lookup_table_initialized
Purpose: Checks the
initialized
flag. If the lookup table is not yet loaded, it callsinitialize_lookup_table
.Key Detail: Ensures that the following bit-counting routines, which depend on the lookup table, do not proceed until the table is valid and loaded.
4.2 initialize_lookup_table
#
subroutine initialize_lookup_table()
implicit none
integer :: ios
! Clear the module array
lookup_table = 0
! Call parse_csv_file to fill the local array
call parse_csv_file("lookup_table.dat", lookup_table, ios)
if (ios == 0) then
initialized = .true.
else
write(error_unit, '(A,I0)') "Failed to read CSV lookup table, ios=", ios
stop 1
end if
end subroutine initialize_lookup_table
Clears the
lookup_table
array by setting all elements to zero initially.Calls
parse_csv_file("lookup_table.dat", lookup_table, ios)
:Attempts to load the data from
lookup_table.dat
intolookup_table
.ios
serves as an error code to indicate success or failure.
Sets
initialized = .true.
ifios == 0
, indicating successful initialization.Stops execution if there is a failure (non-zero
ios
), printing an error message toerror_unit
.
5. CSV File Parsing: parse_csv_file
#
subroutine parse_csv_file(filename, table_data, ios)
use csv_module
implicit none
! Subroutine arguments
character(len=*), intent(in) :: filename
integer(kind=I4), dimension(0:255), intent(out) :: table_data
integer, intent(out) :: ios
...
end subroutine parse_csv_file
Purpose: Reads a CSV file
filename
, processes it, and populatestable_data
(the 256-element integer array).Arguments:
filename
: Name of the CSV file to read.table_data
: The 256-element array to store the parsed data.ios
: Output integer error code to capture success/failure.
5.1 Local Variables#
type(csv_file) :: c
logical :: status_ok
character(len=255), allocatable, dimension(:,:) :: string_table
integer :: nrow, ncol
integer :: i, j, idx, ios2, val
character(len=30) :: trimmed_string
! For pre-processing:
integer :: unit_in, unit_out, status_in, status_out
character(len=1024) :: line, tmpfile
logical :: done
c
: An instance ofcsv_file
fromcsv_module
.string_table
: A 2D array of strings holding the CSV data read bycsv_module
.nrow
,ncol
: Dimensions of the CSV data.idx
: Tracks how many entries have been written totable_data
.trimmed_string
: Used to trim whitespace or extraneous characters before numeric conversion.unit_in
,unit_out
: File unit numbers for reading from the input CSV and writing to a temporary file.line
,tmpfile
: Buffers for file I/O;tmpfile
is the name of the temporary file to generate.done
: A loop control flag (typical for reading files until EOF).
5.2 Preprocessing Logic#
tmpfile = 'tempfile.csv'
open(newunit=unit_in, file=filename, status='old', action='read', &
form='formatted', iostat=status_in)
Opens the
filename
for reading on unitunit_in
.Creates a new file
tempfile.csv
on unitunit_out
withstatus='replace'
.Reads each line from the original file, calls
remove_trailing_comma_ampersand
to clean trailing, &
, then writes the cleaned line totempfile.csv
.
This approach ensures that any extraneous trailing commas or ampersands in the original file (e.g., lines ending with , &
) are removed before CSV parsing.
5.3 CSV Parsing Using csv_module
#
call c%initialize()
call c%read(tmpfile, status_ok=status_ok)
call c%get(string_table, status_ok=status_ok)
Initialize the
csv_file
object.Read the temporary file
tmpfile.csv
, storing its contents in an internal buffer.Get the data as a 2D string array
string_table
.
5.4 Populating table_data
#
nrow = size(string_table,1)
ncol = size(string_table,2)
idx = 0
do i = 1, nrow
do j = 1, ncol
trimmed_string = adjustl(adjustl(string_table(i,j)))
read(trimmed_string, *) val
table_data(idx) = val
idx = idx + 1
end do
end do
Determines the dimensions (
nrow
,ncol
) of the string data.Iterates over all cells, trimming whitespace.
Reads each cell into an integer
val
.Assigns
val
totable_data(idx)
and incrementsidx
.Since
table_data
is only 256 elements, this routine assumes at most 256 integer values are needed.
6. String Cleanup Subroutine: remove_trailing_comma_ampersand
#
subroutine remove_trailing_comma_ampersand(line)
implicit none
character(len=*), intent(inout) :: line
integer :: n
n = len_trim(line)
...
end subroutine remove_trailing_comma_ampersand
Purpose: Removes trailing
&
and,
from a line. This addresses cases like:42, &
which should become simply
42
Implementation:
Find the trimmed length
n = len_trim(line)
.While the last character is
'&'
, remove it.Then check if the last two characters are
", "
, remove them as well.
This makes the file lines more compatible with standard CSV parsing.
7. Bit-Counting Routines#
Each of these functions is declared public
and returns an integer bit count or a string representation (in the case of bit_repr
). They all call ensure_lookup_table_initialized()
first, although for certain algorithms (like naive or Kernighan), the lookup table is not strictly needed. This ensures consistency and that the module’s data is ready.
7.1 naive_count_bits(n) result(bit_count)
#
function naive_count_bits(n) result(bit_count)
integer(kind=I4), value :: n
integer :: bit_count
integer :: i
call ensure_lookup_table_initialized()
bit_count = 0
do i = 1, 32
bit_count = bit_count + and(n, 1_I4)
n = ishft(n, -1)
end do
end function naive_count_bits
Parameters:
n
: a 32-bit integer (kind=I4), passed by value.bit_count
: the returned result (total number of 1 bits).
Logic:
Loops exactly 32 times for a 32-bit integer.
Checks the least significant bit
and(n, 1_I4)
.Accumulates the bit count.
Shifts
n
right by 1 each iteration.
Performance: Straightforward but not the most efficient.
7.2 early_stopping_count_bits(n) result(bit_count)
#
function early_stopping_count_bits(n) result(bit_count)
integer(kind=I4), value :: n
integer :: bit_count
integer :: i
call ensure_lookup_table_initialized()
bit_count = 0
do i = 1, 32
bit_count = bit_count + and(n, 1_I4)
n = ishft(n, -1)
if (n == 0) exit
end do
end function early_stopping_count_bits
Difference from Naive:
After shifting, if
n
becomes 0, it exits early. This optimization helps whenn
is small or has bits set early in the lower portion.
Performance: Faster for numbers with few high-order bits set, as it avoids unnecessary loop iterations.
7.3 lookup_table_count_bits(n) result(bit_count)
#
function lookup_table_count_bits(n) result(bit_count)
integer(kind=I4), value :: n
integer :: bit_count
call ensure_lookup_table_initialized()
bit_count = lookup_table(and(n, 255_I4)) + &
lookup_table(and(ishft(n, -8), 255_I4)) + &
lookup_table(and(ishft(n, -16), 255_I4)) + &
lookup_table(and(ishft(n, -24), 255_I4))
end function lookup_table_count_bits
Approach:
Splits the 32-bit integer
n
into four 8-bit segments using right shifts and masks:The lowest 8 bits:
and(n, 255_I4)
The next 8 bits:
and(ishft(n, -8), 255_I4)
And so forth up to 24 bits.
Lookup:
Each 8-bit value serves as an index into
lookup_table
, which presumably stores the number of set bits for all numbers from 0 to 255.Summing these four partial counts yields the total set bits in 32 bits.
Performance:
Very efficient, since each 8-bit chunk’s bit count is retrieved in O(1) time.
Requires memory for the table and a one-time file read to initialize.
7.4 kernighan_count_bits(n) result(bit_count)
#
function kernighan_count_bits(n) result(bit_count)
integer(kind=I4), value :: n
integer :: bit_count
call ensure_lookup_table_initialized()
bit_count = 0
do while (n /= 0)
n = and(n, n - 1_I4)
bit_count = bit_count + 1
end do
end function kernighan_count_bits
Brian Kernighan’s Algorithm:
Each iteration clears the lowest set bit of
n
by doingn = and(n, n - 1_I4)
.Increments the count of bits for each iteration.
Performance:
Efficient when
n
has relatively few set bits.The loop runs as many times as there are set bits, rather than the total number of bits.
8. Binary Representation: bit_repr(val) result(repr)
#
function bit_repr(val) result(repr)
integer(kind=I4), value :: val
character(len=32) :: repr
integer :: i
do i = 32, 1, -1
if (and(val, 1_I4) == 1_I4) then
repr(i:i) = '1'
else
repr(i:i) = '0'
end if
val = ishft(val, -1)
end do
end function bit_repr
Purpose: Converts a 32-bit integer into a fixed 32-character binary string.
Implementation:
Iterates from bit position 32 down to 1.
Checks the least significant bit of
val
and setsrepr(i:i)
accordingly to'1'
or'0'
.Shifts
val
right by 1.The result is stored in a 32-character string, where
repr(1:1)
is the leftmost bit andrepr(32:32)
is the rightmost bit.
Sign Bit:
For negative values, the leading bits will be the sign bits (in two’s complement representation). Hence, you’ll often see ones at the higher-order bits for negative integers.
9. Key Observations and Takeaways#
Initialization and Encapsulation:
The approach of calling
ensure_lookup_table_initialized()
inside each function ensures that external callers do not need to worry about whether the table is ready.By marking the table and initialization logic private, the module retains control over how the data is loaded and accessed.
Performance Trade-Offs:
Naive and Early Stopping methods rely on shifting all or some bits in a loop, straightforward but generally slower.
Kernighan method iterates only as many times as there are set bits.
Lookup Table method is typically the fastest for large volumes of data, because it reduces the problem to four table lookups.
Error Handling:
The module checks for I/O issues (
ios
codes) when reading files.If CSV reading fails, it uses
stop
statements to terminate the program with a nonzero exit code.
CSV Preprocessing:
Because the CSV lines might contain trailing
, &
, the subroutineremove_trailing_comma_ampersand
cleans each line. This is a helpful technique to ensure the data is in a standard CSV format when eventually read bycsv_module
.
Memory Usage:
The lookup table is only 256 integers in size, which is negligible in modern contexts. However, the code’s design is scalable and well-structured if someone wanted to extend the concept to 16-bit or 64-bit segmented lookups.
10. Conclusion#
bitmanip_mod.f90
is a well-structured module providing:
Multiple Bit-Counting Implementations:
Naive (simple, guaranteed 32 iterations).
Early Stopping (slightly optimized naive approach).
Lookup Table (fast, relies on precomputed 8-bit partial counts).
Kernighan’s (iterates by the number of set bits).
A Portable CSV-Loading Mechanism that leverages a third-party CSV library and local string-manipulation utilities to populate the lookup table.
Utility Function (
bit_repr
) for integer-to-binary-string conversion.Robust Initialization: The separation of concerns between
ensure_lookup_table_initialized
andinitialize_lookup_table
is clean, ensuring the lookup table is always valid when the public routines are called.
Overall, this module exemplifies good Fortran design practices: careful scoping, explicit initialization, and a range of performance-tuned subroutines for a specific bitwise manipulation problem.
Program Code#
bitcount.f90#
program bitcount
use, intrinsic :: iso_fortran_env, only : error_unit
use :: bitmanip_mod
implicit none
integer :: nr_vals
integer(kind=I4), dimension(:), allocatable :: vals
integer(kind=I4) :: result
integer :: i, istat
integer :: naive_count_bits_result, early_stopping_count_bits_result, &
lookup_table_count_bits_result, kernighan_count_bits_result
real :: r, start_time, end_time
print '(A)', 'sanity check: naive, early stopping, lookup table Kernighan'
do i = -16, 16
naive_count_bits_result = naive_count_bits(i)
early_stopping_count_bits_result = early_stopping_count_bits(i)
lookup_table_count_bits_result = lookup_table_count_bits(i)
kernighan_count_bits_result = kernighan_count_bits(i)
!print '(I14, 3A, 4I3)', i, ': ', bit_repr(i), ': ', &
! naive_count_bits(i), early_stopping_count_bits(i), &
! lookup_table_count_bits(i), kernighan_count_bits(i)
print '(I14, 3A, 4I3)', i, ': ', bit_repr(i), ': ', &
naive_count_bits_result, early_stopping_count_bits_result, &
lookup_table_count_bits_result, kernighan_count_bits_result
end do
! create values upfront for benchmarking
call get_arguments(nr_vals)
allocate (vals(nr_vals), stat=istat)
if (istat /= 0) then
write (unit=error_unit, fmt='(A, I0, A)') 'error: can not allocate ', &
nr_vals, ' elements'
stop 3
end if
do i = 1, size(vals)
call random_number(r)
vals(i) = int(r*real(huge(0_I4)), kind=I4)
end do
! time naive implmentation
call cpu_time(start_time)
do i = 1, size(vals)
result = xor(result, naive_count_bits(vals(i)))
end do
call cpu_time(end_time)
print '(A, I0, A, F35.15)', 'naive implementation ', size(vals), &
' iterations: ', end_time - start_time
! time early stopping implmentation
call cpu_time(start_time)
do i = 1, size(vals)
result = xor(result, early_stopping_count_bits(vals(i)))
end do
call cpu_time(end_time)
print '(A, I0, A, F35.15)', 'early stopping implementation ', size(vals), &
' iterations: ', end_time - start_time
! time lookup table implmentation
call cpu_time(start_time)
do i = 1, size(vals)
result = xor(result, lookup_table_count_bits(vals(i)))
end do
call cpu_time(end_time)
print '(A, I0, A, F35.15)', 'lookup table implementation ', size(vals), &
' iterations: ', end_time - start_time
! time Kernighan's algorithm
call cpu_time(start_time)
do i = 1, size(vals)
result = xor(result, kernighan_count_bits(vals(i)))
end do
call cpu_time(end_time)
print '(A, I0, A, F35.15)', 'Kernighan implementation ', size(vals), &
' iterations: ', end_time - start_time
contains
subroutine get_arguments(nr_vals)
use, intrinsic :: iso_fortran_env, only : error_unit
implicit none
integer, intent(out) :: nr_vals
character(len=1024) :: buffer, msg
integer :: istat
if (command_argument_count() /= 1) then
write (unit=error_unit, fmt='(A)') 'error: expecting number of values'
write (unit=*, fmt='(A)') 'error: expecting number of values'
stop 1
end if
call get_command_argument(1, buffer)
read (buffer, fmt=*, iostat=istat, iomsg=msg) nr_vals
if (istat /= 0) then
write (unit=error_unit, fmt='(2A)') 'error: ', trim(msg)
stop 2
end if
end subroutine get_arguments
end program bitcount
bitmanip_mod.f90#
module bitmanip_mod
use, intrinsic :: iso_fortran_env, only : INT32, error_unit
use csv_module
implicit none
private
! A 256-element lookup table stored in the module
integer, public, parameter :: I4 = INT32
integer(kind=I4), dimension(0:255), private :: lookup_table
logical, private, save :: initialized = .false.
! Public procedures
public :: naive_count_bits, early_stopping_count_bits, bit_repr, &
lookup_table_count_bits, kernighan_count_bits
contains
!----------------------------------------------------------------------
!> Main routine that ensures the module lookup_table is initialized
subroutine ensure_lookup_table_initialized()
if (.not. initialized) then
call initialize_lookup_table()
end if
end subroutine ensure_lookup_table_initialized
!----------------------------------------------------------------------
!> Initialize the module lookup_table by reading 256 integers from the CSV
subroutine initialize_lookup_table()
implicit none
integer :: ios
! Clear the module array
lookup_table = 0
! Call parse_csv_file to fill the local array
call parse_csv_file("lookup_table.dat", lookup_table, ios)
if (ios == 0) then
initialized = .true.
else
write(error_unit, '(A,I0)') "Failed to read CSV lookup table, ios=", ios
stop 1
end if
end subroutine initialize_lookup_table
!----------------------------------------------------------------------
!> Reads up to 256 integers from the CSV file using csv-fortran’s get_csv_data_as_str
subroutine parse_csv_file(filename, table_data, ios)
!! Reads a CSV file that has trailing ", &" on each line,
!! removing them before calling csv-fortran's get_csv_data_as_str.
use csv_module
implicit none
! Subroutine arguments
character(len=*), intent(in) :: filename ! Holds the name of the CSV file
integer(kind=I4), dimension(0:255), intent(out) :: table_data
integer, intent(out) :: ios ! Error code
! Local variables
type(csv_file) :: c ! csv-fortran object
logical :: status_ok ! Status flag for csv-fortran objects
character(len=255), allocatable, dimension(:,:) :: string_table ! Holds the CSV data as a table of strings
integer :: nrow, ncol ! Number of rows and columns in the CSV data
integer :: i, j, idx, ios2, val
character(len=30) :: trimmed_string ! Holds the trimmed string which is a data cell
!! For pre-processing:
integer :: unit_in, unit_out, status_in, status_out
character(len=1024) :: line, tmpfile
logical :: done
! Initialize outputs
ios = 0
table_data = 0
! Generate a temp file name
tmpfile = 'tempfile.csv'
!-------------------------------------------------
! STEP 1: Pre-process 'filename' => 'tempfile.csv'
!-------------------------------------------------
open(newunit=unit_in, file=filename, status='old', action='read', &
form='formatted', iostat=status_in)
if (status_in /= 0) then
write(*,*) "Error opening input file:", filename
ios = 101 ! Error return code for opening input file
return
end if
open(newunit=unit_out, file=tmpfile, status='replace', action='write', &
form='formatted', iostat=status_out)
if (status_out /= 0) then
write(*,*) "Error creating temp file:", tmpfile
ios = 102 ! Error return code for creating temp file
close(unit_in)
return
end if
do
read(unit_in, '(A)', iostat=status_in) line
if (status_in < 0) exit ! EOF encountered
if (status_in /= 0) then
ios = 103 ! Error reading input file
exit
end if
call remove_trailing_comma_ampersand(line)
write(unit_out, '(A)') trim(line)
end do
close(unit_in)
close(unit_out)
if (ios /= 0) return ! Error occurred
!----------------------------------
! STEP 2: Use csv-fortran on tmpfile
!----------------------------------
call c%initialize()
call c%read(tmpfile, status_ok=status_ok)
if (ios /= 0) then
write(*,*) "Error reading temp CSV file:", tmpfile
call c%close(status_ok)
return
end if
call c%get(string_table, status_ok=status_ok)
if (ios /= 0) then
write(*,*) "Error extracting data as strings."
call c%close(status_ok)
return
end if
! Parse up to 256 integers
nrow = size(string_table,1)
ncol = size(string_table,2)
idx = 0
do i = 1, nrow
do j = 1, ncol
trimmed_string = adjustl(adjustl(string_table(i,j)))
read(trimmed_string, *) val
table_data(idx) = val
idx = idx + 1
end do
end do
! Close CSV
call c%close(status_ok)
deallocate(string_table)
end subroutine parse_csv_file
!------------------------------------------
subroutine remove_trailing_comma_ampersand(line)
implicit none
character(len=*), intent(inout) :: line
integer :: n
n = len_trim(line)
if (n < 1) return ! Nothing to do
! Remove trailing "&"
do while (n > 0)
if (line(n:n) == "&") then
line(n:n) = ""
n = n - 1
else
exit
end if
end do
! Remove trailing ", "
do while (n >= 2)
if (line(n-1:n) == ", ") then
line(n-1:n) = ""
n = n - 2
else
exit
end if
end do
end subroutine remove_trailing_comma_ampersand
!----------------------------------------------------------------------
!> Bit-counting routines that rely on the lookup_table
function naive_count_bits(n) result(bit_count)
implicit none
integer(kind=I4), value :: n
integer :: bit_count
integer :: i
call ensure_lookup_table_initialized()
bit_count = 0
do i = 1, 32
bit_count = bit_count + and(n, 1_I4)
n = ishft(n, -1)
end do
end function naive_count_bits
function early_stopping_count_bits(n) result(bit_count)
implicit none
integer(kind=I4), value :: n
integer :: bit_count
integer :: i
call ensure_lookup_table_initialized()
bit_count = 0
do i = 1, 32
bit_count = bit_count + and(n, 1_I4)
n = ishft(n, -1)
if (n == 0) exit
end do
end function early_stopping_count_bits
function lookup_table_count_bits(n) result(bit_count)
implicit none
integer(kind=I4), value :: n
integer :: bit_count
call ensure_lookup_table_initialized()
! The lookup table is now loaded
bit_count = lookup_table(and(n, 255_I4)) + &
lookup_table(and(ishft(n, -8), 255_I4)) + &
lookup_table(and(ishft(n, -16), 255_I4)) + &
lookup_table(and(ishft(n, -24), 255_I4))
end function lookup_table_count_bits
function kernighan_count_bits(n) result(bit_count)
implicit none
integer(kind=I4), value :: n
integer :: bit_count
call ensure_lookup_table_initialized()
bit_count = 0
do while (n /= 0)
n = and(n, n - 1_I4)
bit_count = bit_count + 1
end do
end function kernighan_count_bits
function bit_repr(val) result(repr)
implicit none
integer(kind=I4), value :: val
character(len=32) :: repr
integer :: i
do i = 32, 1, -1
if (and(val, 1_I4) == 1_I4) then
repr(i:i) = '1'
else
repr(i:i) = '0'
end if
val = ishft(val, -1)
end do
end function bit_repr
end module bitmanip_mod
The above program is compiled and run using Fortran Package Manager (fpm). The following FPM configuration file (fpm.toml) was used:
name = "Section_Bit_Manipulations_Bit_Count"
[build]
auto-executables = true
auto-tests = true
auto-examples = true
[install]
library = false
[dependencies]
csv-fortran = {git="https://github.com/jacobwilliams/csv-fortran.git"}
[[executable]]
name="Section_Bit_Manipulations_Bit_Count"
source-dir="app"
main="section_bit_manipulations_bit_count.f90"
Build the Program using FPM (Fortran Package Manager)#
import os
root_dir = ""
root_dir = os.getcwd()
code_dir = root_dir + "/" + "Fortran_Code/Section_Bit_Manipulations_Bit_Count"
os.chdir(code_dir)
build_status = os.system("fpm build 2>/dev/null")
Run the Program using FPM (Fortran Package Manager)#
The program is run and the output is saved into a file named ‘data.dat
exec_status = \
os.system("fpm run -- 1024 2>/dev/null")
sanity check: naive, early stopping, lookup table Kernighan
-16: 11111111111111111111111111110000: 28 28 28 28
-15: 11111111111111111111111111110001: 29 29 29 29
-14: 11111111111111111111111111110010: 29 29 29 29
-13: 11111111111111111111111111110011: 30 30 30 30
-12: 11111111111111111111111111110100: 29 29 29 29
-11: 11111111111111111111111111110101: 30 30 30 30
-10: 11111111111111111111111111110110: 30 30 30 30
-9: 11111111111111111111111111110111: 31 31 31 31
-8: 11111111111111111111111111111000: 29 29 29 29
-7: 11111111111111111111111111111001: 30 30 30 30
-6: 11111111111111111111111111111010: 30 30 30 30
-5: 11111111111111111111111111111011: 31 31 31 31
-4: 11111111111111111111111111111100: 30 30 30 30
-3: 11111111111111111111111111111101: 31 31 31 31
-2: 11111111111111111111111111111110: 31 31 31 31
-1: 11111111111111111111111111111111: 32 32 32 32
0: 00000000000000000000000000000000: 0 0 0 0
1: 00000000000000000000000000000001: 1 1 1 1
2: 00000000000000000000000000000010: 1 1 1 1
3: 00000000000000000000000000000011: 2 2 2 2
4: 00000000000000000000000000000100: 1 1 1 1
5: 00000000000000000000000000000101: 2 2 2 2
6: 00000000000000000000000000000110: 2 2 2 2
7: 00000000000000000000000000000111: 3 3 3 3
8: 00000000000000000000000000001000: 1 1 1 1
9: 00000000000000000000000000001001: 2 2 2 2
10: 00000000000000000000000000001010: 2 2 2 2
11: 00000000000000000000000000001011: 3 3 3 3
12: 00000000000000000000000000001100: 2 2 2 2
13: 00000000000000000000000000001101: 3 3 3 3
14: 00000000000000000000000000001110: 3 3 3 3
15: 00000000000000000000000000001111: 4 4 4 4
16: 00000000000000000000000000010000: 1 1 1 1
naive implementation 1024 iterations: 0.000078999903053
early stopping implementation 1024 iterations: 0.000065000029281
lookup table implementation 1024 iterations: 0.000004000030458
Kernighan implementation 1024 iterations: 0.000035000033677
Program Output Interpretation by ChatGPT 4o1#
1. Sanity Check Results#
1.1 Verification of Four Methods#
From \(-16\) through \(16\), the program prints:
Integer value (e.g., -16, -15, 0, 1, 2, etc.).
32-bit binary representation of the integer (two’s complement for negatives).
Four integers: the bit-count results for each method
Naive (
naive_count_bits
)Early Stopping (
early_stopping_count_bits
)Lookup Table (
lookup_table_count_bits
)Kernighan (
kernighan_count_bits
)
For instance:
-16: 11111111111111111111111111110000: 28 28 28 28
...
16: 00000000000000000000000000010000: 1 1 1 1
Significance:
All methods produce identical counts for each integer, confirming the correctness and consistency of the four implementations.
Negative integers (e.g., \(-1\), \(-2\), etc.) appear in two’s complement form, hence the leading bits are
1
. That explains large bit counts for values like \(-1\) (32 ones → bit count = 32).
1.2 Specific Observations#
\(-1\):
Binary is all ones (111111...1111
), yielding a bit count of 32 for each method.\(-16\):
Binary ends in11110000
. Because \(-16\) has four zeros at the lower end and ones in the high positions, the bit count is 28.Small Positive Values:
For example, \(3\) is000000...0011
with a count of 2 bits.
For \(15\) (00000000000000000000000000001111
), the bit count is 4.
These results ensure that the range from negative to positive integers is correctly handled.
2. Timing and Performance#
After the sanity check, the program runs 1024 iterations of each bit-counting method on some dataset (in this snippet, presumably 1024 integers), printing the elapsed CPU time. Example output:
naive implementation 1024 iterations: 0.000079999910668
early stopping implementation 1024 iterations: 0.000055999960750
lookup table implementation 1024 iterations: 0.000004000030458
Kernighan implementation 1024 iterations: 0.000035000033677
Interpretation:
Naive (0.00008s)
Checks each of 32 bits unconditionally for each integer.
Takes the most time (though still small), indicative of a higher overhead approach.
Early Stopping (0.00006s)
Similar logic to Naive, but exits the loop early if the remaining bits are zero.
Slightly faster than Naive in many cases.
Lookup Table (0.000004s)
Significantly faster because it counts bits by splitting each 32-bit integer into four 8-bit chunks and uses a precomputed table of the bit counts for each byte.
Fast constant time lookups dominate here.
Kernighan (0.000035s)
Uses
n = n & (n - 1)
, clearing one set bit per iteration.Performance depends on the number of set bits, not all 32.
Often quite efficient, though typically not quite as fast as table lookups for a large batch of random data.
Hence, the performance ordering in this particular test shows Lookup Table as the fastest, followed by Kernighan, Early Stopping, and then Naive.
3. Overall Significance#
Correctness Check: The first part (the “sanity check”) assures that for any integer in \([-16,16]\), all four methods return the same bit-count. This validates the integrity of each algorithm.
Performance Comparison: The second part (timing) highlights the efficiency trade-offs between the methods. It illustrates how a lookup table can drastically reduce execution time when counting bits repeatedly, while Kernighan can also be quite efficient depending on the data distribution.
Thus, the output both confirms that each bit-counting approach works correctly and demonstrates the relative performance of each algorithm over 1024 iterations.