rgw@icdattcwsm:~/Blog$ cat 2024-09-27-By-2027-Start-Up-Valuations-Will-Be-AI-Generated.md
By 2027, startup valuations will be based on an AI analysis of their logs, making everything more transparent. In this post, I show how AI-based analytics will kill such high-IQ real-world jobs. Although only the last step applies generative AI, it is pertinent to mention that even the deterministic methods used here (part of the rgwml library) were AI-generated.

A good logs table typically consists of two classes of columns/properties:

- event_logs: such as the index and timestamp
- event_attributes: real-world physical quantities associated with the event, for instance price for transactional events, distance for locomotion events, etc.

Let's say we are tracking a marathon training regime. Let's get the logs into a dataframe, which we load into an rgwml p object.

    >>> import rgwml as r
    >>> d = r.p()
    >>> d.fd()
    1: /home/rgw/Documents/tables/running.csv (Last Modified: 2024-09-27 05:34:38, Size: 0.00 MB)
    Choose a number corresponding to the file you want to load: 1
    DataFrame loaded successfully.
        id  date        goal                                               weight  distance  zone_2_mins  zone_3_mins  zone_4_mins  zone_5_mins
     0   1  2024-08-19  Run 5 100m sprints in zone 5, with a 2 km warm...   88.80         6           44           25            6            2
     1   2  2024-08-21  Run 6.6km, keeping heart rate in Zone 2             86.95       6.6           36           37            7            0
     2   3  2024-08-23  Run 8.8km, keeping heart rate in zone 2             85.95         9           56           28            4            0
     3   4  2024-08-24  Run 11km, keeping heart rate in zone 2              87.65      11.5           44           29           19            0
     4   5  2024-08-26  Run 5 100m sprints in zone 5, with a 2 km warm...   89.75      6.98           39           15            3            1
     5   6  2024-08-29  Run 6.6km, keeping heart rate in Zone 2             88.15      6.16           30           11            1            0
     6   7  2024-08-30  Run 8.8km, keeping heart rate in Zone 2             87.75      9.30           45           17           18            2
     7   8  2024-08-31  Run 12.1km, keeping heart rate in Zone 2            87.45     10.22           23           40           26            4
     8   9  2024-09-02  Run 5 100m sprints in zone 5, with a 2 km warm...      87       6.4           17           28           16            4
     9  10  2024-09-04  Run 5.5km, keeping heart rate in Zone 2             87.30       6.6           42           19            6            0
    10  11  2024-09-05  Run 6.6km, keeping heart rate in Zone 2             85.75       7.7           33           21            0            0
    11  12  2024-09-07  Run 9.9km, keeping heart rate in Zone 2             85.30       9.9           47           28            5            0
    12  13  2024-09-09  Run 6 100m sprints in zone 5, with a 2 km warm...   86.50       6.0           44           15            8            2
    13  14  2024-09-11  Run 7.7 km, keeping heart rate in Zone 2            86.40       8.2           34           20            4            0
    14  15  2024-09-12  Run 8.8 km, keeping heart rate in Zone 3            85.45       8.8           37           25           33            3
    15  16  2024-09-14  Run 11.1 km, keeping heart rate in Zone 2           85.30      11.1           47           31            7            0
    16  17  2024-09-16  Run 6 100m sprints in zone 5, with a 2 km warm...   86.60       6.6           43           16           11            6
    17  18  2024-09-18  Run 7.7 km, keeping heart rate in Zone 2            84.60       7.7           32           22            1            0
    18  19  2024-09-19  Run 8.8 km, keeping heart rate in Zone 2            84.45       8.8           40           22            0            0
    19  20  2024-09-21  Run 12.1 km, keeping heart rate in Zone 3           84.35      12.1           30           39           42            1
    20  21  2024-09-23  Run 6 100m sprints in zone 5, with a 2 km warm...   86.70      8.25           55           17            8            4
    21  22  2024-09-25  Run 8.8 km, keeping heart rate in Zone 2            86.05       8.4           33           23            0            0
    22  23  2024-09-26  Run 9.9 km, keeping heart rate in Zone 3            86.05      10.4           36           29           30            3
    Columns: ['id (object)', 'date (object)', 'goal (object)', 'weight (object)', 'distance (object)', 'zone_2_mins (object)', 'zone_3_mins (object)', 'zone_4_mins (object)', 'zone_5_mins (object)']

We then call the .des() method of the rgwml p instance to make the following computations on numerically parseable data.
    >>> d.des()
                      as     nc        acr     mcr   acr10pct  acr3l10pct       aacv  macv        acv     mcv       accv    mccv  accv10pct  mccv10pct  accv3l10pct
    id            276.00  22.00   1.000000   1.000   0.269014    2.333333   1.000000   1.0   1.000000   1.000  12.500000  12.500      2.200      2.000    46.666667
    date            0.00   0.00        NaN     NaN        NaN         NaN        NaN   NaN        NaN     NaN        NaN     NaN        NaN        NaN          NaN
    goal            0.00   0.00        NaN     NaN        NaN         NaN        NaN   NaN        NaN     NaN        NaN     NaN        NaN        NaN          NaN
    weight       1990.25  -2.75  -0.125000  -0.225  -0.002503    0.250000   0.938636   0.8  -0.125000  -0.225  86.429545  86.450     -0.275     -0.725   199.600000
    distance      192.71   4.40   0.200000   1.010   0.046331   -0.233333   2.146364   2.2   0.200000   1.010   8.486818   8.325      0.440      0.310    20.750000
    zone_2_mins   887.00  -8.00  -0.363636  -4.500   0.014625   -3.666667  11.545455  10.0  -0.363636  -4.500  38.318182  38.000     -0.800      2.000    89.666667
    zone_3_mins   557.00   4.00   0.181818   3.500   0.006075   -0.666667   9.090909   6.5   0.181818   3.500  24.181818  22.500      0.400      1.000    56.000000
    zone_4_mins   255.00  24.00   1.090909  -1.500   0.000553    7.666667  12.909091   9.0   1.090909  -1.500  11.318182   7.000      2.400     -1.500    30.666667
    zone_5_mins    32.00   1.00   0.045455   0.000   0.004731    1.000000   2.045455   2.0   0.045455   0.000   1.363636   0.000      0.100      0.500     4.666667

    Key and Calculation Formulas:
    as: Aggregate Sum (sum of all values in the column)
    nc: Net Change (sum of consecutive differences: Σ(diff(column)))
    acr: Average Change Rate (mean of consecutive differences: mean(diff(column)))
    mcr: Median Change Rate (median of consecutive differences: median(diff(column)))
    acr10pct: Average Change Rate Over 10 Percentile Intervals (mean of change rates over 10 percentile intervals)
    acr3l10pct: Average Change Rate Over Last 3 10-Percentile Intervals (mean of changes over last 3 intervals)
    aacv: Average Absolute Change Value (mean of absolute differences: mean(abs(diff(column))))
    macv: Median Absolute Change Value (median of absolute differences: median(abs(diff(column))))
    acv: Average Change Value (mean of consecutive differences: mean(diff(column)))
    mcv: Median Change Value (median of consecutive differences: median(diff(column)))
    accv: Average Cumulative Change Value (mean of cumulative differences: mean(diff(cumsum(column))))
    mccv: Median Cumulative Change Value (median of cumulative differences: median(diff(cumsum(column))))
    accv10pct: Average Cumulative Change Value Over 10 Percentile Intervals
    mccv10pct: Median Cumulative Change Value Over 10 Percentile Intervals
    accv3l10pct: Average Cumulative Change Over Last 3 10-Percentile Intervals

Finally, we feed this data to a generative AI engine to analyze.

Prompt: this is marathon training regime data. analyze it with and draw insights for the athlete

GPT-4o Response:

1. **Weight Management:** Slight decrease, review diet if unintended.
2. **Distance:** Increasing well, maintain or ramp up progressively.
3. **Zone 2 (Low Intensity):** Decreasing, ensure sufficient easy run time.
4. **Zone 3 (Moderate Intensity):** Positive increase, good for stamina.
5. **Zone 4 (High Intensity):** Significant increase, great for speed/endurance mix.
6. **Zone 5 (Maximum Effort):** Minimal, should remain controlled to avoid overtraining.
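
For readers who want to check the deterministic part by hand, here is a minimal sketch of the simpler formulas from the key, written in plain Python. It is not rgwml's implementation: `diffs` and `describe_changes` are hypothetical helper names, and the percentile-interval metrics (`acr10pct`, `accv10pct`, etc.) are left out because the key does not spell out how the intervals are built. The `weight` values are copied from the table loaded earlier.

```python
from itertools import accumulate
from statistics import mean, median


def diffs(values):
    """Consecutive differences: values[i + 1] - values[i]."""
    return [b - a for a, b in zip(values, values[1:])]


def describe_changes(values):
    """Hypothetical helper: a subset of the metrics listed in the key."""
    d = diffs(values)
    abs_d = [abs(x) for x in d]
    # diff(cumsum(column)), as written in the key
    cum_d = diffs(list(accumulate(values)))
    return {
        "as": sum(values),       # aggregate sum
        "nc": sum(d),            # net change
        "acr": mean(d),          # average change rate
        "mcr": median(d),        # median change rate
        "aacv": mean(abs_d),     # average absolute change value
        "macv": median(abs_d),   # median absolute change value
        "accv": mean(cum_d),     # average cumulative change value
        "mccv": median(cum_d),   # median cumulative change value
    }


# Weight column from the running logs above, in row order.
weight = [
    88.80, 86.95, 85.95, 87.65, 89.75, 88.15, 87.75, 87.45,
    87.00, 87.30, 85.75, 85.30, 86.50, 86.40, 85.45, 85.30,
    86.60, 84.60, 84.45, 84.35, 86.70, 86.05, 86.05,
]

stats = describe_changes(weight)
for name, value in stats.items():
    print(f"{name:>5}: {value:.6f}")
```

Under these assumptions, the numbers for `weight` should line up with the corresponding cells of the `d.des()` output, which makes this a handy sanity check before handing the summary to a language model.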