2024-03-29 - GoogleSheets

INDEX

2024-01-22 13:30 - Google Sheets

So I've done a lot of google sheets work in the past and I just need a place to dump info
First of all, in sheets "A3:A" means "all of column A from A3"
Most of what I do uses array formula so something only has to be written once

######################################
TIME
######################################
	#####################################
	## Want to default a column as the current time?
	#####################################
	# B=start-time, C=end-time
	={"Clock OUT_"; 
	  MAP(
	    $B$2:$B,
	    $C$2:$C,
	    LAMBDA(
	  	   a, b, 
	  	   IFS(
	  	     a="", "" , # If start time is empty, end time is empty
	  	     b="", TIMEVALUE(NOW()), # Elif end time is empty, use current time
	  	     1=1, b # Else (A and B are non-empty) use value in B
	)))}

	# ={"...";} means the first line in an array is "..." - so {a,b,c;d,e,f} makes a 3x2 grid

	# MAP() can be used instead of ARRAYFORMULA to map array values to functions
	# MAP>ARRAYFORMULA since the latter has random things not work (like IFS) and = compare
	# But MAP only works if all the arrays passed have the same dimensions

	# LAMBDA(a,b,...) lets you write an arbitrary function - useful for complex problems
	# 1=1 is a good "else" statement for IFS as IFS is just condition-value pairs

	#####################################
	## Want to find the TIME difference between 2 timestamps? Need some conditions
	#####################################
	# B=start-time, D=end-time
	={"Duration";
	  MAP(
	    $B$2:$B,
	    $D$2:$D,
	    LAMBDA(
	      a, b,
	      IFS(
	        b="", "",
	        b<=a, ABS(a-b-1),
	        1=1, b-a
	)))}

	# The case where end<=start is like 10pm->2am (22:00->02:00)
	# Do ABS(22:00 - 02:00 - 24:00) = ABS(-4:00) = 4:00

	#####################################
	## Want to extract the start/end times of a list of times associated to a list of dates?
	#####################################
	# Timesheet A=dates, B=start-times, D=end-times
	={"Start";
	  ARRAYFORMULA(
	    XLOOKUP(
	      UNIQUE($A3:$A), # Extract from a list of dates the unique values
	      Timesheet!A2:A, # Find which row each date is in
	      Timesheet!B2:B, # Where the dates match, find the associated time
	      "",,1 # Sort in start-end (use -1 to sort reverse for finding end-times)
	))}

	# So A2:A is a list of dates - you clock in (B2:B) and out (D2:D)
	# XLOOKUP (with ARRAYFORMULA) scans the date list for items matching A3:A
	# A3:A is wrapped in "unique" just so I don't have to scan empty values

	# If XLOOKUP finds a match it grabs the associated start/end time (Timesheet!B2:B or D2:D)

	#####################################
	## How can you sum all the individual start-end times to find the active time?
	#####################################
	={"Active";
	  ARRAYFORMULA(
	    SUMIF( # Essentially this is a loop, summing dates (in Y) which equal a given value (X)
	      Timesheet!$A2:$A, # Get all the rows and index them by date (list out Y)
	      FILTER( # Loop through local list of dates (set X for each loop)
	        $A3:$A,
	        $A3:$A<>"" # Show only non-"" items
	      ),
	      Timesheet!$E2:$E # For rows that match, return the "active time" column
	    ) # So makes an array SUM(date=A); SUM(date=B), SUM(date=C), ... 
	)}

#####################################
STRINGS
#####################################
	#####################################
	## Want to check if a substring exists in an array of text?
	#####################################
	={"Ex";
	  ARRAYFORMULA(
	    REGEXMATCH(
	      $G$2:$G, # For each item in this array, test this regex
	      "^.*(?i)Exercise.*" # Checks if string contains "Exercise" (case insensitive)
	))}
	
	# Returns an array of TRUE/FALSE - can add checkboxes for a clear view


	#####################################
	## What if you want to see a summary of if a given day has a condition met?
	#####################################
	# First use that substring command and have a column for "this time range met this condition"
	# Then scan through that column and filter for a specific date
	={"Ex";
	  ARRAYFORMULA(
	    COUNTIF(
	      INDEX(
	        FILTER( # Returns an array of rows where the value in column H is true
	          Timesheet!$A:$I,
	          Timesheet!H:H=TRUE
	        ),,1), # Get the first column of this array
	      $A$3:$A # Count how many items in that column match this local list of dates
	    )>0 # TRUE if it's more than 0 (so there is a row for this date where condition is met)
	)}

	# So this has set Timesheet!H as the column where this condition is listed
	# We filter for rows where the condition is met and use INDEX to extract the first column
	# Then simply COUNTIF that condition has been met for a given list of dates
	# If that count is more than 0, this condition has been met

#####################################
CALCULATION
#####################################
	#####################################
	## How can you calculate a rolling average in an array formula?
	#####################################
	={"Roll Avg";
	  TRANSPOSE( # Because it outputs horizontally, rotate it
	    MAKEARRAY(
	      1, COUNTA($E$3:$E), # Make an array 1xN where N=size of this list of timestamps
	      LAMBDA( # For each value in that array, do this calculation
	        x, y,
	        AVERAGE( # Get the average of whatever array is inside
	          INDIRECT( # Construct an array address and return what is in that array
	            ADDRESS(2+y,5)& # Sets the start position (e.g. y=2 then address=4,5=D5)
	            ":"&
	            ADDRESS(2+$F$1+y,5) # Sets the end position (using F1=length)
	)))))}

	# So this just generates a list of arrays to make rolling averages of and the prints them

	#####################################
	## How can you conditionally sum an array of values?
	#####################################
	# H->L = "Energy fat carbs fibre protein", M=count
	# So this is taking a grid of foods and their nutrients and summing based on consumption count
	={"Current",
	  BYCOL( # Loop through this array by column
	    $H$6:$L,
	    LAMBDA(
	      colx,
	      ARRAYFORMULA(
	        SUM( # Add all values in the column together (get summary of, say, protein count)
	          IF(
	            $M$6:$M>0, # Check if this condition is met
	            $M$6:$M*colx, # If condition met, multiply each item in column by the count
	            0 # If condition not met, ignore
	)))))}


	#####################################
	## How can you look up info about an array of items and conditionally combine them
	#####################################
	# There is a grid of meals ("ingredient+mass") and a table of "ingredients-nutrients"
	# This command combines those and returns a summary of nutrients for a given meal
	# The challenge is doing this in a single array command, for an arbitrary size of either grid
	=MAP(
	  {1,2,3,4,5}, # 5 nutrients so loop through each nutrient
	  LAMBDA(
	    nutrient,
	    BYROW( # Loop through rows (each row is a separate meal)
	      ARRAYFORMULA( # Loop for each item in the row (each item is a separate ingredient)
	        IFERROR( # If can't lookup ingredient, value=0
	          XLOOKUP(
	            CHOOSECOLS($C3:$L,1,3,5,7,9), # Extract every odd column (ingredients)
	            Nutrition!$A$27:$A, # Reference to list of all ingredients
	            CHOOSECOLS(Nutrition!B$27:X,nutrient) # Return nutrition for that ingredient
	          ),
	        0)*
	        CHOOSECOLS($C3:$L,2,4,6,8,10)/100), # Multiply this by the mass used in the meal
	      LAMBDA(
	        xrow,
	        SUM(xrow) # Sum each ingredient's nutrient contents
	))))

	# This is a very complex command, which conceptually runs backwards with many layers to it
	# But that's the nature of using arrays like this



2024-03-29 00:27 - Google Sheets

Another key note is that the formatting of a string can affect how it is sorted/processed
Be careful if you have a list of values with numbers and strings, even with the same values
You need to format the columns to make them work the same

Also "TO_TEXT" stops FALSE and TRUE from working within checkboxes
This is because checkboxes use =TRUE and =FALSE variables, instead of strings

######################################
How can I convert an N dimensional hierarchy into 2d rows and columns?
######################################
	# Column G = values like "1", "3 1 1" and "3 1" (e.g. "3 1 2" = Task 3 part 1b)
	# We have 3 result columns: Task N part Nn -> TaskN - TaskN-N - TaskN-Nn

	#####################################
	## Extract the column position
	#####################################
	# We want to preserve space so a subtask uses the same line as its parent, if possible
	=MAP($G$3:$G$1000, LAMBDA(val, # Loop through hierarchy vals ("3 1 1") as "val"
	  LAMBDA(val_parent,val_len,IFS(
	   val="","", # Ignore empty values
	   LEN(val)=1,0, # If length of hierarchy is 1 (e.g. "task 1") then set position=0
	   COUNTIF(
	    $G$3:$G$1000,val_parent)=0, # Check if parent ID ("3 1") exists
	    val_len-1, # If it does not exist, set position lower (avoids some unparented items)
	   1=1,val_len)) # For anything else, use the calculated column position from num of spaces
	  (LEFT(val, FIND(" ", val, LEN(val)-1)), # val_parent = "3 1 1"->"3 1"
	   LEN(val)-LEN(SUBSTITUTE(val," ","")) # val_len = len(3 1 1)-len(311) = num of spaces
	)))
	# The key feature of this is passing functions into LAMBDA to clean up some calculations
	# That and using MAP to iterate through lines in an array - could equally use "BY_ROW"
	# This outputs an array like "0 1 1 2 0 0 1 2" - ordered and staircase-like

	#####################################
	## Extract the row position
	#####################################
	# This was harder to do - need to sort the list of tasks then cumulatively find the positions
	# Column K = extracted column positions
	=ARRAYFORMULA(SCAN(0, # SCAN takes an array and gives cumulative results of a calculation
	 IF(
	  $K$3:$K$1000<>"", # Ignore empty items (if values not equal to "")
	  $K$2:$K$1000>=$K$3:$K$1000,""), # Output an array of TRUE/FALSE if values are increasing 
	 LAMBDA(cum,val,
	  IFS(
	   val,cum+1, # If TRUE, output incremented value
	   val="","", # If empty, output nothing
	   1=1,cum # Anything else (FALSE), output (unchanged) cumulative value 
	))))
	# So you get a position that increases only if the column value stays the same or goes down
	# Column input ".. 0 1 1 2 0" gives output "1 1 2 2 3"
	# This allows you to preserve space while moving down whenever necessary
	# Also requires that the input data is fully sorted - or things become inefficient/unsorted

	#####################################
	## Convert row-col values into coordinates
	#####################################
	=ARRAYFORMULA(MAP(
	  $K$3:$K$1000, # K = "COLS" (column positions)
	  $L$3:$L$1000, # L = "ROWS" (row positions)
	  LAMBDA(cols,rows,
	   IF(cols<>"", # Ignore empty data (where no column position given)
	   ((cols*2)+1)&","&((rows*3)-2) # Do linear operations to stretch coords
	   ,"")
	)))
	# This one is fairly simple compared to the rest - just extracts rows from arrays
	# Also uses DATA&" "&DATA to output variables as strings

	#####################################
	## Generate a table using this coordinates and a set of data
	#####################################
	# This is a bit messy but works enough for the use case. Here is the table:
	# |STATUS TITLE|STATUS TITLE|STATUS TITLE|
	# |ID     DESC |ID     DESC |ID     DESC |
	# |-      DATE |-      DATE |-      DATE |
	# |------------|------------|------------|
	# |STATUS|TITLE|STATUS|TITLE|STATUS|TITLE|
	# ... (REPEATS for further items)
	# So you have a 100x6 (ROWxCOL) column grid (which is made of Nx3 2x3 grids)
	# This makes translating to and from the grids somewhat awkward
	# The idea is generate a grid and then lookup based on positions backwards to find the data
	# These grid coordinates line up with the ones generated before as "COL,ROW"
	=MAKEARRAY(100,6, LAMBDA(row,col, # Make an array of 100 rows and 6 columns
	  LAMBDA(coords,state,id,task,desc,deadline, # Save values as convenient variables
	   IFERROR(XLOOKUP(col&","&row,coords,state), # 0x0 -> lookup "state" TRUE/FALSE
	   # You do "row-1" as you're comparing the current position to a reference coordinate
	   # That reference coordinate is the "state" value - in the top left
	   IFERROR(XLOOKUP(col&","&row-1,coords,id), # 1x0 -> "coords" (reference coordinate)
	   IFERROR(XLOOKUP(col-1&","&row,coords,task), # 0x1 -> "task" (title of task)
	   IFERROR(XLOOKUP(col-1&","&row-1,coords,desc), # 1x1 -> "desc" (description of task)
	   IFERROR(TEXT(
	    XLOOKUP(col-1&","&row-2,coords,deadline), # 2x1 -> "deadline" (date of task)
	    "YYYY-MM-DD"), # Set format explicitly to avoid any further issues
	   IF(COUNTIF(coords,col&","&row-2)>0,"-","") # 2x0 -> just empty (but needs a value)
	))))))
	# Save array ranges as variables - e.g. "M3:M1000" is "coords" - corresponds to LAMBDA func
	($M$3:$M$1000,$F$3:$F$1000,$G$3:$G$1000,$H$3:$H$1000,$I$3:$I$1000,$J$3:$J$1000)))
	# This is independent of the data order but requires no values have the same coordinates
	# I would assume if they do you would just get the first item (depending on XLOOKUP defaults)

	#####################################
	## Extract the data table back from the formatted table
	#####################################
	# This is somewhat easier than it sounds as extracting data just requires it is consistent
	# You extract the STATUS-ID bit and the TITLE-DESC bit seperately then recombine
	# Everything must be filtered somewhat the same for it to line up right
	={ # Create an array (combine arrays together in 1 command)
	 CHOOSECOLS(
	  WRAPROWS(VSTACK( # Stack these columns onto each other and then wrap it into a table
	   FILTER($T$2:$T1000,$T$2:$T1000<>""), # Get sub-grid 1 but only non-empty items
	   FILTER($V$2:$V1000,$V$2:$V1000<>""),
	   FILTER($X$2:$X1000,$X$2:$X1000<>"")),
	  3), # Set it to wrap at every 3rd item (as there are 3 sub-grids)
	 1,2), # Get only columns 1 and 2 - ignore the "-" item - gives you "STATUS ID" table
	 WRAPROWS(VSTACK( # Similarly stack columns and wrap into a table
	  FILTER($U$2:$U1000,$U$2:$U1000<>""),
	  FILTER($W$2:$W1000,$W$2:$W1000<>""),
	  FILTER($Y$2:$Y1000,$Y$2:$Y1000<>"")),
	 3) # Wrap on every 3rd - gives "TITLE DESC DATE"
	 } # As you've treated this as 1 array it gives a table: "STATUS ID TITLE DESC DATE"

######################################
How can I generate a sequence of tasks using just 1 cell to describe the data structure?
######################################
	# For the sake of simplicity I combined 2 formulas into 1 - to run arbitrarily
	# This all works using a single array formula - which means it can lag rather a bit

	#####################################
	## Generate a 3d set of sequences
	#####################################
	# "seq:French (duo) S2:5:3 3 3 4 3 3" = sequence starting with "French (duo) S2"
	# This sequence goes from S2-5 (1,2,end) to S2-8 (1,2,3,end) S2-10 (1,2,end)
	# So you need to generate an array for each unit sized based on how many sections it has
	=IF(INDEX(SPLIT(A$2,":"),1,1)="seq", # If text between start and 1st : is "seq" then ...
	 TOCOL(LAMBDA(inp,
	  MAP( # Pass in arrays as variables
	   SEQUENCE( # inpPos = sequence of A->B ("5:5 3 3 3" = 4 items from 5 = 5->9)
	    LEN(inp),1, # Generate an array of N rows, 1 column
	    INDEX(SPLIT(A$2,":"),1,3)), # Get the starting position (e.g. "5")
	   # SEQUENCE is just a very specific kind of MAKEARRAY
	   # I can't tell why I've done it this way - may be requirements on the length of arrays
	   MAKEARRAY( # inpLen = splits "5 3 3 3" into an array of 5,3,3,3
	    LEN(inp), 1,
	    LAMBDA(a, b, INDEX(SPLIT(inp," "),1,a))), # Splits string by spaces
	   LAMBDA(inpPos, inpLen, # Pass in these arrays as arguments
	    MAKEARRAY(1, inpLen, LAMBDA(x,y, INDEX( # For each 5,3,3,3 pass in array 1->5, 1->3 etc.
	     # Output an array of strings for each of these sequences
	     SPLIT(A$2,":"),1,2)&"-"&inpPos&" "& # Output "French (duo) S2-N " (with N=1, N=2 ...)
	     IF(y-inpLen=0,"end","pt "&y) # Output "pt 1", "pt 2" etc. till final value "end"
	     # When y=inpLen the current item (e.g. "S2-N1 pt 4") is the last one
	  ))))) (INDEX(SPLIT(A$2,":"),1,4)) # Pass in data array "5 3 3 3" as "inp"
	 ,3), # Convert result to a single columns with TOCOL - ignore whitespace and errors (3)

	#####################################
	## Use a single cell of data to generate lists of progress tasks
	#####################################
	# "-seqloop:Intuition Pumps,s,11,1:Money Machine,p,200,50:Random Walks,s,15,1"
	# Now instead of generating sequences you loop through arbitrary items in a single cell
	# This sequence gives a list like "Intuition Pumps s1->s11" but for arbitrary list of books
	IF(INDEX(SPLIT(A$2,":"),1,1)="seqloop", # If text similarly starts with "seqloop"
	 TOCOL(
	  BYROW( # Iterate by row in a table (but first generate the table)
	   MAKEARRAY( # A,a,a,a:B,b,b,b -> (A,a,a,a),(B,b,b,b) as table ("TITLE TYPE LEN STEP")
	    LEN(A$2)-LEN(SUBSTITUTE(A$2,":","")),1, # Get number of items in the list
	    LAMBDA(pos,y,INDEX(SPLIT(A$2,":"),1,pos+1))), # Generate a 2d array of items by colons
	   LAMBDA(book, # Loop through each book (row in the table)
	    LAMBDA(title,type,end,step,
	     ARRAYFORMULA( # For each book outputs "title: p1", "title: p2" etc.
	      title&": "&type& # Output "title: p" (as in title page1)
	      SEQUENCE(1,end/step,step,step) # Output sequence with steps (e.g. 50,100,150 etc.)
	     ))
	    (INDEX(SPLIT(book,","),1,1), # Separate variables by commas - pass each into lambda
	     INDEX(SPLIT(book,","),1,2),
	     INDEX(SPLIT(book,","),1,3),
	     INDEX(SPLIT(book,","),1,4))
	   )),3) # Join into 1 columns and ignore whitespace/errors
	, "")) # For anything else (not "seq" or "seqloop"), ignore