Usage

Place the following snippet on the wiki page, replacing function_name with the name of the function to use and separating the parameters with |. Named parameters, such as avgs, are assigned with the = sign: |avgs=2,3,4|.

For all the examples shown below, the file Data:COVID-19 cases in Asturias.tab is going to be used.

{{#invoke:Sandbox/Ajuanca|function_name|param1|param2|param name = param value}}
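
Most parameters can be given either by name or by position. A minimal sketch of that fallback, using a plain Lua table as a stand-in for the frame.args table that Scribunto provides (the resolve_param helper is hypothetical and is not part of the module):

```lua
-- Hypothetical helper: return the named parameter if present,
-- otherwise fall back to the positional one.
local function resolve_param(args, name, position)
	return args[name] or args[position]
end

-- Stand-in for the arguments of
-- {{#invoke:Sandbox/Ajuanca|function_name|COVID-19 cases in Asturias.tab|column_name=total_cases}}
local args = {
	[1] = "COVID-19 cases in Asturias.tab",
	column_name = "total_cases",
}

local src = resolve_param(args, "src", 1)                 -- taken from position 1
local column_name = resolve_param(args, "column_name", 2) -- taken from the name
```

A named parameter takes precedence over the positional one when both are present.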

Incidence proportion: get_all_incidence

This function calculates the incidence proportion of a specific event for all the available dates.

E.g.: the daily confirmed cases of flu per 10,000 inhabitants in Asturias.

Parameters

The possible parameters are:

  • src= or the first param: The name of the tabular file. These files are hosted on Wikimedia Commons, in the Data namespace ("Data:name of tabular file.tab"). Give the file name including its ".tab" suffix, as in the examples on this page. E.g. the file itself, "COVID-19 cases in Asturias.tab".
  • column_name= or the second param: The name of the column where the data of the event is stored. E.g. the column with the total confirmed cases of COVID-19 in Asturias, called "total_cases".
  • date_name= or the third param: The name of the column where the dates of the event data are stored. E.g. the column with the date (as a String in YYYY-MM-DD format), called "date".
  • inhabitants= or the fourth param: The population size of the given region. E.g. 1,018,775 (entered as 1018775) inhabitants living in Asturias, according to the INE's provisional figure for January 1st, 2020.
  • nth= or the fifth param: The exponent n of the power of ten (10^n) used to scale the output. This value will normally be 4 or 5, although it can be changed for smaller or bigger population sizes. E.g. 5, which causes the output to be given per 10^5 (100,000) inhabitants.
  • graph= param: A Boolean (true or false) specifying whether a graph should be generated as output. Default is false. E.g. true (show a graph as output).
  • ltable= param: A Boolean (true or false) specifying whether the Lua variable should be returned as output. Default is false. E.g. false (do not return the variable itself). This is an unusual value to request.
  • wtable= param: A Boolean (true or false) specifying whether a wikitable should be generated as output. Default is false. E.g. false (do not return a wikitable). This option is NOT AVAILABLE yet.
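
The inhabitants and nth parameters combine into a single scaling step: each value is multiplied by 10^nth and divided by the population size. A short worked sketch of that arithmetic (the incidence function name and the case count of 2,500 are made up for illustration; only the population figure comes from this page):

```lua
-- Incidence proportion per 10^nth inhabitants.
local function incidence(cases, inhabitants, nth)
	return cases * 10 ^ nth / inhabitants
end

-- Hypothetical count of 2,500 confirmed cases in a population of 1,018,775,
-- expressed per 100,000 inhabitants (nth = 5).
local per_100k = incidence(2500, 1018775, 5)
print(string.format("%.1f", per_100k)) -- prints 245.4
```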


Note 1: column_title can be used instead of the column_name parameter. Likewise, date_title can be given instead of date_name. It is unusual, but two or more columns can share the same title (though not the same name), so this approach is less safe than column_name and date_name.


Note 2: If graph=true is requested, all the parameters available in Module:Graph can be given. They are passed on to that module without any modification.
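
The pass-through of graph parameters amounts to merging the caller's named parameters into the table that is handed to Module:Graph. A simplified sketch of that merge, assuming plain Lua tables (merge_named is a hypothetical name; the module's own join_tables additionally appends the positional parameters):

```lua
-- Copy every named (non-numeric) entry of extra into target.
local function merge_named(target, extra)
	for key, value in pairs(extra) do
		if type(key) ~= "number" then
			target[key] = value
		end
	end
	return target
end

-- Values computed by the module (illustrative data only).
local graph_args = { x = "2020-03-01, 2020-03-02", y = "0.1, 0.3" }
-- Named parameters given by the caller in the #invoke call.
local user_args = { graph = "true", width = "800", xType = "date" }

merge_named(graph_args, user_args)
-- graph_args now also carries width and xType, unchanged.
```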


As a recap, the snippet to include will look like this:

{{#invoke:Sandbox/Ajuanca|get_all_incidence|src|column_name|date_name|inhabitants|nth|graph=true}}

Examples

Let's say we want a graph of the Data:COVID-19 cases in Asturias.tab tabular file, showing the daily incidence of the total COVID-19 confirmed cases (which is registered in the "total_cases" column), expressed per 100,000 inhabitants, and assuming a total population of 1,018,775. Note also that the dates, stored in the "date" column, are written in a format compatible with Template:Graph.


In this case, we will type:

{{#invoke:Sandbox/Ajuanca|get_all_incidence|COVID-19 cases in Asturias.tab|total_cases|date|1018775|5|graph=true}}

which will output:

Script error: The function "get_all_incidence" does not exist.


But this graph is hard to read. Remember that we can pass any Module:Graph parameter. In this case, we should specify that the x-values are dates. We can also set the width and height, the line width and color, and a legend.

With {{#invoke:Sandbox/Ajuanca|get_all_incidence|COVID-19 cases in Asturias.tab|total_cases|date|1018775|5|graph=true|height=300|width=800|colors=#2c72ff|linewidth=2|xType=date|legend=Legend|y1Title=Confirmed cases per 100,000 inhabitants|title=COVID-19 cases in Asturias}}

we get:

Script error: The function "get_all_incidence" does not exist.

Various incidence proportions: get_incidences

This function calculates various incidence proportions of the given events for all the available dates. It can also output the mean over a specified interval, instead of all the daily values.

E.g.: the weekly average of the daily confirmed cases of flu per 10,000 inhabitants in Asturias.

Parameters

The possible parameters are:

  • src= or the first param: The name of the tabular file. These files are hosted on Wikimedia Commons, in the Data namespace ("Data:name of tabular file.tab"). Give the file name including its ".tab" suffix, as in the examples on this page. E.g. the file itself, "COVID-19 cases in Asturias.tab".
  • column_names= or the second param: The names of all the columns where the data of the events are stored, separated by commas (,). E.g. the column with the total confirmed cases of COVID-19 in Asturias and the column with the total recovered people, called "total_cases" and "total_recovered" respectively: total_cases,total_recovered.
  • date_name= or the third param: The name of the column where the dates of the event data are stored. E.g. the column with the date (as a String in YYYY-MM-DD format), called "date".
  • inhabitants= or the fourth param: The population size of the given region. E.g. 1,018,775 (entered as 1018775) inhabitants living in Asturias, according to the INE's provisional figure for January 1st, 2020.
  • nth= or the fifth param: The exponent n of the power of ten (10^n) used to scale the output. This value will normally be 4 or 5, although it can be changed for smaller or bigger population sizes. E.g. 5, which causes the output to be given per 10^5 (100,000) inhabitants.
  • avgs= param: A comma-separated list giving, for each column, the number of consecutive values to average. The number of values in avgs should equal the number of column_names given. E.g. to average the first column over two days and the second over three days: 2,3.
  • graph= param: A Boolean (true or false) specifying whether a graph should be generated as output. Default is false. E.g. true (show a graph as output).
  • ltable= param: A Boolean (true or false) specifying whether the Lua variable should be returned as output. Default is false. E.g. false (do not return the variable itself). This is an unusual value to request.
  • wtable= param: A Boolean (true or false) specifying whether a wikitable should be generated as output. Default is false. E.g. false (do not return a wikitable). This option is NOT AVAILABLE yet.


Note: If graph=true is requested, all the parameters available in Module:Graph can be given. They are passed on to that module without any modification.
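
The averaging requested through avgs works on consecutive, non-overlapping blocks of values: every full block produces one average, and leftover values at the end are discarded. A standalone sketch of that behaviour (block_average and parse_periods are illustrative names, not functions exported by the module, and the input values are made up):

```lua
-- Average consecutive, non-overlapping blocks of `period` values.
-- A trailing block with fewer than `period` values is dropped.
local function block_average(values, period)
	local averages = {}
	local sum, count = 0, 0
	for _, value in ipairs(values) do
		sum = sum + value
		count = count + 1
		if count == period then
			table.insert(averages, sum / period)
			sum, count = 0, 0
		end
	end
	return averages
end

-- Split a comma-separated parameter such as avgs=2,7 into numbers.
local function parse_periods(list)
	local periods = {}
	for item in list:gmatch("[^,]+") do
		table.insert(periods, tonumber(item))
	end
	return periods
end

local periods = parse_periods("2,7")
local averaged = block_average({ 1, 3, 5, 7, 9 }, periods[1])
-- averaged is { 2, 6 }; the trailing 9 has no partner and is dropped.
```

Because the blocks do not overlap, a period of 7 turns daily data into one point per week rather than a rolling seven-day average.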


As a recap, the snippet to include will look like this:

{{#invoke:Sandbox/Ajuanca|get_incidences|src|column_names|date_name|inhabitants|nth|avgs|graph=true}}

Examples

Let's say we want a graph of the Data:COVID-19 cases in Asturias.tab tabular file, showing the daily incidence of the total COVID-19 confirmed cases (which is registered in the "total_cases" column), expressed per 100,000 inhabitants, and assuming a total population of 1,018,775. We request a two-day average. The same plot also shows the recovered cases, expressed per 100,000 as well, with a one-week average.

In this case, we will type:

{{#invoke:Sandbox/Ajuanca|get_incidences|COVID-19 cases in Asturias.tab|total_cases,total_recovered|date|1018775|5|avgs=2,7|graph=true}}

which will output:

Script error: The function "get_incidences" does not exist.


--[[
		------------------
	---| EPIDEMICS MODULE |---
		------------------
	~~ A specialized module to work with epidemics data. ~~
	
	In order to use it:
	- Data should be stored in Wikimedia Commons, as a Tabular file.
	  ie: "COVID-19 cases in Asturias.tab"
	- The module is added to any wiki with: 
	  {{#invoke:Module:Sandbox/Ajuanca/GraphIt|param1|param2|...}}
	- All the functions that don't begin with an underscore are meant
	  to be invoked. The remaining functions are "internal" functions.
	  In case you face some problems with the functions, ask me on my talk
	  page (User_talk:Ajuanca)
	- Make sure you give the correct params. All functions include an
	  explanation. If a graph is requested, all available parameters at
	  Module:Graph can be passed.
	  
	Feel free to leave any comment, suggestion or complaint on my 
	discussion page (User_talk:Ajuanca).

	ToDo list:
	[ ] Generate wikitable
	[ ] Add positive rate

	Some ideas that maybe are implemented:
	* Divide functions (graph, wikitable) instead of booleans?? 
	* Join functions (get_avg_incidence + get_all_incidence = get_incidences)
	* Get rid of internal join_tables function.
]]--
local p = {}
local mgraph = require("Module:Graph")

-- Join two tables.
-- Number index are added over the first table.
-- Other type of keys are added "as they are".
function p.join_tables(_table1, _table2)
	for k, arg in pairs(_table2) do
		if not tonumber(k) then
			_table1[k] = arg
		else
			table.insert(_table1, arg)
		end
	end
	return _table1
end

-- Converts a table to a String.
-- Keys should be consecutive integers starting at 1.
-- Values are concatenated with ", ".
function p.table2string(_table)
	local parts = {}
	for i = 1, #_table do
		parts[i] = tostring(_table[i])
	end
	return table.concat(parts, ", ")
end

-- Graph the given data.
-- All Module:Graph parameters are accepted.
function p._graph(args)
	local ret =  mgraph.chart {args=args}
	local graph = mw.getCurrentFrame():extensionTag('graph', ret)
	return graph
end

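-- Average the incidences over consecutive, non-overlapping periods.
-- [1] or incidences: The list of incidence values.
-- [2] or dates: The list of dates, one per incidence value.
-- period: The number of values per average. Default is 3.
-- Returns {averages, dates}; each date is the last one of its period.
-- Values left over after the last full period are discarded.
function p._get_avg_incidence(args)
	local incidences = args.incidences or args[1]
	local dates = args.dates or args[2]
	-- period may arrive as a String from the wiki page, and a String never
	-- equals a number, so convert it before comparing.
	local avg_period = tonumber(args.period) or 3
	local periods_avg = {}
	local periods_dates = {}
	local period_cases = {}
	for _, sincidence in ipairs(incidences) do
		table.insert(period_cases, sincidence)
		if #period_cases == avg_period then
			local total = 0
			for j = 1, #period_cases do
				total = total + tonumber(period_cases[j])
			end
			table.insert(periods_avg, total / #period_cases)
			period_cases = {}
		end
	end
	for i, sdate in ipairs(dates) do
		if i % avg_period == 0 then
			table.insert(periods_dates, sdate)
		end
	end
	return {periods_avg, periods_dates}
end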

-- Get an average for the incidence proportion of a specific event.
-- ie. The week average of the daily confirmed cases of flu.
-- The given parameters are:
-- [1] or src: The tabular data, ie: "example.tab"
-- [2] or column_name: The name of the column.
-- [3] or date_name: The name of the date column.
-- [4] or inhabitants: The population size of the given region.
-- [5] or nth: The power of 10 in which the result is given.
-- column_title: The column title of the data to work with.
-- date_title: The column title of the date.
-- period: The number of values to perform the average with. 
-- 		   Default is 3 (ie: 3 days).
function p.get_avg_incidence(frame)
	local return_graph = frame.args.graph == "true"
	local return_table = frame.args.ltable == "true"
	local return_wikitable = frame.args.wtable == "true"
	local all_incidence = p.join_tables(p._get_all_incidence(frame.args), frame.args)
	local avg_incidence = p._get_avg_incidence(all_incidence)
	local to_return = {}
	if return_table then
		table.insert(to_return, avg_incidence)
	end
	avg_incidence = p.join_tables(avg_incidence, frame.args)
	if return_graph then
		avg_incidence.x = p.table2string(table.remove(avg_incidence, 2))
		avg_incidence.y = p.table2string(table.remove(avg_incidence, 1))
		table.insert(to_return, p._graph(avg_incidence))
	end
	if return_wikitable then
		mw.log("in progress")
	end
	if #to_return == 1 then
		to_return = to_return[1]
	end
	return to_return
end

function p._get_all_incidence(args)
	local data_page = args.src or args[1]
	local data = mw.ext.data.get(data_page)
	local column_name = args.column_name or args[2]
	local date_name = args.date_name or args[3]
	local total_residents = tonumber(args.inhabitants) or tonumber(args[4])
	local n = tonumber(args.nth) or tonumber(args[5])
	local column_title = args.column_title or nil
	local date_title = args.date_title or nil
	local ci = nil
	local di = nil
	local column_values = {}
	local date_values = {}
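	for j, field in ipairs(data.schema.fields) do
		-- Compare titles only when a title was actually requested, so that a
		-- missing title never matches a missing parameter (nil == nil).
		if field.name == column_name or (column_title and field.title == column_title) then
			ci = j
		elseif field.name == date_name or (date_title and field.title == date_title) then
			di = j
		end
		if ci and di then
			break
		end
	end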
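	for _, record in ipairs(data.data) do
		local value = tonumber(record[ci])
		-- Skip rows without a numeric value together with their date:
		-- table.insert(t, nil) adds nothing, which would leave the two
		-- lists with different lengths and shift values against dates.
		if value ~= nil then
			table.insert(column_values, (value * 10 ^ n) / total_residents)
			table.insert(date_values, record[di])
		end
	end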
	return {column_values, date_values}
end

-- Get the incidence proportion of a specific event for all the available dates.
-- Def: "Number of new cases of disease during specified time interval"
-- eg. The daily confirmed cases of flu per 10,000 inhabitants.
-- The given parameters are:
-- [1] or src: The tabular data, ie: "example.tab"
-- [2] or column_name: The name of the column.
-- [3] or date_name: The name of the date column.
-- [4] or inhabitants: The population size of the given region.
-- [5] or nth: The power of 10 in which the result is given.
-- column_title: The column title of the data to work with.
-- date_title: The column title of the date.
function p.get_all_incidence(frame)
	local all_incidence = p._get_all_incidence(frame.args)
	local return_graph = frame.args.graph == "true"
	local return_table = frame.args.ltable == "true"
	local return_wikitable = frame.args.wtable == "true"
	local to_return = {}
	if return_table then
		table.insert(to_return, all_incidence)
	end
	all_incidence = p.join_tables(all_incidence, frame.args)
	if return_graph then
		all_incidence.x = p.table2string(table.remove(all_incidence, 2))
		all_incidence.y = p.table2string(table.remove(all_incidence, 1))
		table.insert(to_return, p._graph(all_incidence))
	end
	if return_wikitable then
		mw.log("in progress")
	end
	if #to_return == 1 then
		to_return = to_return[1]
	end
	return to_return
end

-- Splits a comma-separated String into a table of its items.
function p.string2table(_string)
	local stable = {}
	for value in _string:gmatch("[^,]+") do
		table.insert(stable, value)
	end
	return stable
end

-- Get the incidence proportion of a specific event for the specific intervals.
-- Various events and averages can be given. If no averages are specified,
-- daily info is given.
-- eg. The daily confirmed cases of flu per 10,000 inhabitants and the week
-- average of hospital occupation due to flu per 10,000 inhabitants.
-- The given parameters are:
-- [1] or src: The tabular data, ie: "example.tab"
-- [2] or column_names: The names of the columns for the different events.
-- [3] or date_name: The name of the column that contains the date information.
-- [4] or inhabitants: The population size of the given region.
-- [5] or nth: The power of 10 in which the result is given.
-- avgs: The averages of the events to be calculated. 
--       ie: 3 (days)
function p.get_incidences(frame)
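	local columns = nil
	local incidences = {}
	local dates = {}
	local avgs = nil
	if frame.args.column_names then
		columns = p.string2table(frame.args.column_names)
	else
		-- Read the second positional parameter in place; removing it would
		-- shift date_name, inhabitants and nth out of their positions.
		columns = p.string2table(frame.args[2])
	end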
	if frame.args.avgs then
		avgs = p.string2table(frame.args.avgs)
	end
	if not avgs then
		for i, incidence in ipairs(columns) do
			frame.args.column_name = incidence
			incidence = p._get_all_incidence(frame.args)
			table.insert(incidences, incidence[1])
			table.insert(dates, incidence[2])
		end
	else
		for i, incidence in ipairs(columns) do
			frame.args.column_name = incidence
			local all_incidence = p._get_all_incidence(frame.args)
			local all_args = p.join_tables(all_incidence, frame.args)
			all_args.period = tonumber(avgs[i])
			incidence = p._get_avg_incidence(all_args)
			table.insert(incidences, incidence[1])
			table.insert(dates, incidence[2])
		end
	end
	local return_graph = frame.args.graph == "true"
	local return_table = frame.args.ltable == "true"
	local return_wikitable = frame.args.wtable == "true"
	local to_return = {}
	if return_table then
		table.insert(to_return, {incidences, dates})
	end
	-- Graph only when requested, as documented (graph defaults to false).
	if return_graph then
		local to_graph = {}
		for i, incidence in ipairs(incidences) do
			local key = "y" .. tostring(i)
			to_graph[key] = p.table2string(incidence)
			-- xnth values are not inserted due to Module:Graph limitations.
		end
		to_graph.x = p.table2string(dates[1])
		to_graph = p.join_tables(to_graph, frame.args)
		table.insert(to_return, p._graph(to_graph))
	end
	if return_wikitable then
		mw.log("in progress")
	end
	if #to_return == 1 then
		to_return = to_return[1]
	end
	return to_return
end

return p