rgw@icdattcwsm:~$ l
rgw@icdattcwsm:~/Blog$ cat 2024-09-27-By-2027-Start-Up-Valuations-Will-Be-AI-Generated.md

By 2027, start up valuations will be based on an AI analysis of their logs. Making everything more transparent. In this post, I show how AI based analytics will kill such high-IQ real world jobs. Although the last part of the code applies generative AI, it is pertinent to mention that even the deterministic methods (part of the rgwml library), here, were AI-generated.

A good logs table typically consists of the two classes of columns/ properties:
- event_logs: such as the index and timestamp
- event_attributes: real world physical qualities associated with the event. For instance - price for transactional events; distance for locomotive events, etc.

Let's say we are tracking a marathon training workout regime. Let's get the logs into our dataframe, which we load into an rgwml p object.

    >>> import rgwml as r
    >>> d.fd()
    1: /home/rgw/Documents/tables/running.csv (Last Modified: 2024-09-27 05:34:38, Size: 0.00 MB)
    Choose a number corresponding to the file you want to load: 1
    DataFrame loaded successfully.
        id        date                                               goal weight distance zone_2_mins zone_3_mins zone_4_mins zone_5_mins
    0    1  2024-08-19  Run 5 100m sprints in zone 5, with a 2 km warm...  88.80        6          44          25           6           2
    1    2  2024-08-21            Run 6.6km, keeping heart rate in Zone 2  86.95      6.6          36          37           7           0
    2    3  2024-08-23            Run 8.8km, keeping heart rate in zone 2  85.95        9          56          28           4           0
    3    4  2024-08-24             Run 11km, keeping heart rate in zone 2  87.65     11.5          44          29          19           0
    4    5  2024-08-26  Run 5 100m sprints in zone 5, with a 2 km warm...  89.75     6.98          39          15           3           1
    5    6  2024-08-29            Run 6.6km, keeping heart rate in Zone 2  88.15     6.16          30          11           1           0
    6    7  2024-08-30            Run 8.8km, keeping heart rate in Zone 2  87.75     9.30          45          17          18           2
    7    8  2024-08-31           Run 12.1km, keeping heart rate in Zone 2  87.45    10.22          23          40          26           4
    8    9  2024-09-02  Run 5 100m sprints in zone 5, with a 2 km warm...     87      6.4          17          28          16           4
    9   10  2024-09-04            Run 5.5km, keeping heart rate in Zone 2  87.30      6.6          42          19           6           0
    10  11  2024-09-05            Run 6.6km, keeping heart rate in Zone 2  85.75      7.7          33          21           0           0
    11  12  2024-09-07            Run 9.9km, keeping heart rate in Zone 2  85.30      9.9          47          28           5           0
    12  13  2024-09-09  Run 6 100m sprints in zone 5, with a 2 km warm...  86.50      6.0          44          15           8           2
    13  14  2024-09-11           Run 7.7 km, keeping heart rate in Zone 2  86.40      8.2          34          20           4           0
    14  15  2024-09-12           Run 8.8 km, keeping heart rate in Zone 3  85.45      8.8          37          25          33           3
    15  16  2024-09-14          Run 11.1 km, keeping heart rate in Zone 2  85.30     11.1          47          31           7           0
    16  17  2024-09-16  Run 6 100m sprints in zone 5, with a 2 km warm...  86.60      6.6          43          16          11           6
    17  18  2024-09-18           Run 7.7 km, keeping heart rate in Zone 2  84.60      7.7          32          22           1           0
    18  19  2024-09-19           Run 8.8 km, keeping heart rate in Zone 2  84.45      8.8          40          22           0           0
    19  20  2024-09-21          Run 12.1 km, keeping heart rate in Zone 3  84.35     12.1          30          39          42           1
    20  21  2024-09-23  Run 6 100m sprints in zone 5, with a 2 km warm...  86.70     8.25          55          17           8           4
    21  22  2024-09-25           Run 8.8 km, keeping heart rate in Zone 2  86.05      8.4          33          23           0           0
    22  23  2024-09-26           Run 9.9 km, keeping heart rate in Zone 3  86.05     10.4          36          29          30           3
    Columns: ['id (object)', 'date (object)', 'goal (object)', 'weight (object)', 'distance (object)', 'zone_2_mins (object)', 'zone_3_mins (object)', 'zone_4_mins (object)', 'zone_5_mins (object)']

We then class upon the .des() method of the rgwml p instance, to make the following computations on numerically parseable data.

    >>> d.des()
                      as     nc       acr    mcr  acr10pct  acr3l10pct       aacv  macv       acv    mcv       accv    mccv  accv10pct  mccv10pct  accv3l10pct
    id            276.00  22.00  1.000000  1.000  0.269014    2.333333   1.000000   1.0  1.000000  1.000  12.500000  12.500      2.200      2.000    46.666667
    date            0.00   0.00       NaN    NaN       NaN         NaN        NaN   NaN       NaN    NaN        NaN     NaN        NaN        NaN          NaN
    goal            0.00   0.00       NaN    NaN       NaN         NaN        NaN   NaN       NaN    NaN        NaN     NaN        NaN        NaN          NaN
    weight       1990.25  -2.75 -0.125000 -0.225 -0.002503    0.250000   0.938636   0.8 -0.125000 -0.225  86.429545  86.450     -0.275     -0.725   199.600000
    distance      192.71   4.40  0.200000  1.010  0.046331   -0.233333   2.146364   2.2  0.200000  1.010   8.486818   8.325      0.440      0.310    20.750000
    zone_2_mins   887.00  -8.00 -0.363636 -4.500  0.014625   -3.666667  11.545455  10.0 -0.363636 -4.500  38.318182  38.000     -0.800      2.000    89.666667
    zone_3_mins   557.00   4.00  0.181818  3.500  0.006075   -0.666667   9.090909   6.5  0.181818  3.500  24.181818  22.500      0.400      1.000    56.000000
    zone_4_mins   255.00  24.00  1.090909 -1.500  0.000553    7.666667  12.909091   9.0  1.090909 -1.500  11.318182   7.000      2.400     -1.500    30.666667
    zone_5_mins    32.00   1.00  0.045455  0.000  0.004731    1.000000   2.045455   2.0  0.045455  0.000   1.363636   0.000      0.100      0.500     4.666667

    Key and Calculation Formulas:
      as: Aggregate Sum (sum of all values in the column)
      nc: Net Change (sum of consecutive differences: Σ(diff(column)))
      acr: Average Change Rate (mean of consecutive differences: mean(diff(column)))
      mcr: Median Change Rate (median of consecutive differences: median(diff(column)))
      acr10pct: Average Change Rate Over 10 Percentile Intervals (mean of change rates over 10 percentile intervals)
      acr3l10pct: Average Change Rate Over Last 3 10-Percentile Intervals (mean of changes over last 3 intervals)
      aacv: Average Absolute Change Value (mean of absolute differences: mean(abs(diff(column))))
      macv: Median Absolute Change Value (median of absolute differences: median(abs(diff(column))))
      acv: Average Change Value (mean of consecutive differences: mean(diff(column)))
      mcv: Median Change Value (median of consecutive differences: median(diff(column)))
      accv: Average Cumulative Change Value (mean of cumulative differences: mean(diff(cumsum(column))))
      mccv: Median Cumulative Change Value (median of cumulative differences: median(diff(cumsum(column))))
      accv10pct: Average Cumulative Change Value Over 10 Percentile Intervals
      mccv10pct: Median Cumulative Change Value Over 10 Percentile Intervals
      accv3l10pct: Average Cumulative Change Over Last 3 10-Percentile Intervals

Finally, we feed this data to a generative AI engine to analyze.

    Prompt: this is marathon training regime data. analyze it with and draw insights for the athlete

    GPT-4o Response:

    1. **Weight Management:** Slight decrease, review diet if unintended.
    2. **Distance:** Increasing well, maintain or ramp up progressively.
    3. **Zone 2 (Low Intensity):** Decreasing, ensure sufficient easy run time.
    4. **Zone 3 (Moderate Intensity):** Positive increase, good for stamina.
    5. **Zone 4 (High Intensity):** Significant increase, great for speed/endurance mix.
    6. **Zone 5 (Maximum Effort):** Minimal, should remain controlled to avoid overtraining.

rgw@icdattcwsm:~/Blog$ cd ..