2
\$\begingroup\$

I've started teaching myself c/c++ with the intent of being able to writing python bindings to c code. I have a c library used for astronomical calculations given a date lat lon and some other parameters.

In my implementation, I use a date range to run a start stop step for loop over the astronomical calcs (I don't actually call them).

src/core.h

#ifndef CORE_H_  // include guard
#define CORE_H_

#ifdef __cplusplus

extern "C" {
#endif

int** array_from_date_range(unsigned int* start_day,
                            unsigned int* end_day,
                            unsigned int interval);

unsigned int julian_day(int year, int month, int day);

#ifdef __cplusplus
}
#endif

#endif  // CORE_H_

src/core.c

#include "core.h"
#include <stdlib.h>
#include "array.h"

const int MONTHS_IN_YEAR = 12;
const double AVG_DAYS_PER_MONTH = 30.6001;
const double AVG_DAYS_PER_YEAR = 365.25;
const double GREGORIAN_CALENDAR =
    1582 + 10 / MONTHS_IN_YEAR + 15 / AVG_DAYS_PER_YEAR;

/*calculate julian day*/
unsigned int julian_day(int year, int month, int day) {
  double jd, gc_cor, margin;

  if (month < 3) {
    year -= 1;
    month += MONTHS_IN_YEAR;
  }

  gc_cor = 0;
  // date is greater than the GERGORIAN_CALENDAR value apply a correction
  if ((year + month / 12 + day / AVG_DAYS_PER_YEAR) >= GREGORIAN_CALENDAR) {
    margin = (double)((int)(year / 100));
    gc_cor = 2 - margin + (double)((int)(margin / 4));
  }
  //
  if (year >= 0) {
    jd = (double)((int)(AVG_DAYS_PER_YEAR * year)) +
         (double)((int)(AVG_DAYS_PER_MONTH * (month + 1.0))) + day + 1720994.5 +
         gc_cor;
  } else {
    jd = (double)((int)(AVG_DAYS_PER_YEAR * year - 0.75)) +
         (double)((int)(AVG_DAYS_PER_MONTH * (month + 1.0))) + day + 1720994.5 +
         gc_cor;
  }
  return jd;
}

int** array_from_date_range(unsigned int* start_day,
                            unsigned int* end_day,
                            unsigned int interval) {
  unsigned int jd0, jd1, jdx;
  jd0 = julian_day(start_day[0], start_day[1], start_day[2]);
  jd1 = julian_day(end_day[0], end_day[1], end_day[2]);
  if(!interval)interval=1;
  int n = (int)floor((jd1 - jd0)/interval);
  int** arr = allocate_array(n, 5);
  int count=0, i ;
  for (i = 0; i < n; i++) {
    jdx = jd0 +count;
    arr[i][0] = jdx;  // int function(jdx, *pyargs)
    arr[i][1] = jdx;  // int function(jdx, *pyargs)
    arr[i][2] = jdx;  // int function(jdx, *pyargs)
    arr[i][3] = jdx;  // int function(jdx, *pyargs)
    arr[i][4] = jdx;  // int function(jdx, *pyargs)
    count = i + interval;
  }
  return arr;
  free_array(arr, n);
}

The // int function(jdx, *pyargs) is just a placeholder my intent is to return a 2d array into pandas dataframe. Something along these lines...

pd.DataFrame(array_from_date_range((2022,1,1),(2022,1,15),0,**kwargs),columns=['jdate',...]).set_index('jdate')`

src/array.h

#ifndef ARRAY_H_  // include guard
#define ARRAY_H_

#ifdef __cplusplus
/**/
extern "C" {
#endif

int** allocate_array(int Rows, int Cols);
void free_array(int** board, int Rows);

#ifdef __cplusplus
}
#endif

#endif  // ARRAY_H_

src/array.c

#include <stdlib.h>

int** allocate_array(int nRows, int nCols) {
  // allocate Rows rows, each row is a pointer to int
  int** arr = (int**)malloc(nRows * sizeof(int*));
  int row;

  // for each row allocate Cols ints
  for (row = 0; row < nRows; row++) {
    arr[row] = (int*)malloc(nCols * sizeof(int));
  }

  return arr;
}

void free_array(int** arr, int nRows) {
  int row;
  // first free each row
  for (row = 0; row < nRows; row++) {
    free(arr[row]);
  }
  // Eventually free the memory of the pointers to the rows
  free(arr);
}

I am using a cmake and google test

CMakeLists.txt

cmake_minimum_required(VERSION 3.5)
#
project(cproject)
set(CMAKE_CXX_STANDARD 14)
# ##############################################################################
# Python
# ##############################################################################
find_package(Python 3.10 REQUIRED Interpreter Development NumPy)
find_package(PythonLibs REQUIRED)
# ##############################################################################
# Source lib
# ##############################################################################
add_library(core SHARED src/core.c src/array.c)
target_link_libraries(core PUBLIC Python::NumPy)
# ##############################################################################
# GTest
# ##############################################################################
enable_testing()
find_package(GTest CONFIG REQUIRED)
add_executable(
  core_test
  src/tests/core_test.cc
  src/tests/array_test.cc
)
target_link_libraries(core_test PUBLIC core ${PYTHON_LIBRARIES} GTest::gtest GTest::gtest_main )

add_test(UnitTests core_test)
# ##############################################################################
# Installation
# ##############################################################################
set(CMAKE_INSTALL_PREFIX /usr/local/src)
install(TARGETS core DESTINATION ${CMAKE_INSTALL_PREFIX})

src/tests/core_tests.cc

#include "../core.h"
#include <Python.h>
#include <limits.h>
#include "gtest/gtest.h"


namespace CORE {

TEST(Array2D, ArrTest) {
  unsigned int start_day[3] = {2022, 1, 1};
  unsigned int end_day[3] = {2022, 1, 15};
  int** arr = array_from_date_range(start_day, end_day, 1);
  ASSERT_EQ(arr[0][0], 2459580);
  ASSERT_EQ(arr[1][0], 2459581);

  int** arr2 = array_from_date_range(start_day, end_day, 2);
  ASSERT_EQ(arr2[0][0], 2459580);
  ASSERT_EQ(arr2[1][0], 2459582);
}

TEST(JulianDay, JdTest) {
  unsigned int jd = julian_day(2022, 1, 1);
  ASSERT_EQ(jd, 2459580);
}

}  // namespace CORE

edit:

I caught an error with my step

\$\endgroup\$

2 Answers 2

5
\$\begingroup\$

General Observations

Caveats, I am a C and C++ programmer, I don't know python or Google tests at all. I also don't know much about astronomy or converting Gregorian dates to Julian dates.

While some compilers accept the .cc file extension for C++, a more common file extension for C++ is .cpp.

The tests should be written in 2 languages, neither one is C++. The unit tests should be written in C and integration tests should be written in python. There is no reason to use C++.

The tests are not complete, unit tests should include negative path testing as well as positive path testing. Negative path testing is where you force errors into the unit test to see what the code does, it is necessary for writing robust that won't fail when users do the wrong thing. Testing arrays of lengths 1 or 2 are also not enough, I would use at least a year, and possibly a decade as a test unit. This will also allow you to do performance testing.

Compiler Warnings

I haven't compiled the code however, there should be a warning message for the function array_from_date_range(). The function free_array() will never be called because it is after the return statement. In both C and C++ the return statement indicates that there will be no more processing in that function.

I suggest that when you compile C or C++ source code you use the -Wall switch to catch all possible errors in the code.

Memory Leaks

This code is going to leak memory since the arrays are never deallocated as mentioned above in the Compiler Warnings section. You also don't want to delete the allocated memory before you use it.

Test for Possible Memory Allocation Errors

In modern high-level languages such as C++, memory allocation errors throw an exception that the programmer can catch. This is not the case in the C programming language. While it is rare in modern computers because there is so much memory, memory allocation can fail, especially if the code is working in a limited memory application such as embedded control systems. In the C programming language when memory allocation fails, the functions malloc(), calloc() and realloc() return NULL. Referencing any memory address through a NULL pointer results in undefined behavior (UB).

Possible unknown behavior in this case can be a memory page error (in Unix this would be call Segmentation Violation), corrupted data in the program and in very old computers it could even cause the computer to reboot (corruption of the stack pointer).

To prevent this undefined behavior a best practice is to always follow the memory allocation statement with a test that the pointer that was returned is not NULL.

Convention When Using Memory Allocation in C

When using malloc(), calloc() or realloc() in C a common convention is to sizeof(*PTR) rather sizeof(PTR_TYPE), this make the code easier to maintain and less error prone, since less editing is required if the type of the pointer changes.

\$\endgroup\$
5
\$\begingroup\$

More documentation

Functions in src/core.h deserve details for their valid range of year, month, day and Julian day.

Somewhere an explanation or reference of julian_day is deserved.

In particular, what is the year before 1 (1 AD)? Is it 0 or -1 (1 BC)?

Range

Julian day deserves to be wider than int as int may only be 16-bit. Consider long or int32_t, etc.

Precession

Julian day is fundamentally 0.5 off from a calendar date. Julian day 0.0 is noon of Monday, January 1, 4713 BC, proleptic Julian calendar.

Converting dates (year, month, day) to/from Julian day encounter "off-by 0.5" due to this.

An alternative is to use Modified JD which reckons from midnight and not noon.

Correctness

AVG_DAYS_PER_YEAR = 365.25 looks amiss as it is wrong for years since 1582. AVG_DAYS_PER_YEAR = 365.2425 expected. Yet code has cryptic margin = (double)((int)(year / 100)); which might account for this.

In any case, AVG_DAYS_PER_YEAR better as AVG_DAYS_PER_JULIAN_YEAR.

Credit

I suspect OP lifted the formula from someone's prior work - if so, a reference to that algorithm is a must.

Unnamed constants

Constants like 1720994.5 deserve to be named like:

// Some appropriate name.
#define Days_From_4716BC_To_AD1Jan1 1720994.5 

Other naked constants include: 3, 2, 4, 0.75.

Name Space

Why is core defining array_from_date_range, julian_day? A more uniform naming is deserved.

e.g. astro_core.h, astro_jd_array_from_date_range, astro_julian_day.

Why cast to int?

(double)((int)(year / 100)) is unnecessarily restrictive to the int range. Equivalent code is trunc(year / 100) and certainly less complex.

Edge cases

I have doubts that code is functionally correct near edge dates like Jan 1, 1 AD and March 1, 1 AD. A typical source of error is lifting the algorithm that needs mod and using truncate which is what casting to int does.

Perhaps supply the source of the algorithm?

\$\endgroup\$

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.