Changes to the manual
larsvilhuber committed Jun 30, 2024
1 parent 6a2485d commit c89e7c4
Showing 6 changed files with 192 additions and 176 deletions.
4 changes: 2 additions & 2 deletions 00-targets.md
@@ -5,9 +5,9 @@ We want to check
::: {.incremental}

- that your code runs without problems, after all the debugging.
- that your code runs without manual intervention, and with low effort.
- that your code generates a log file that you can inspect, and that you could share with others.
- that it will run on somebody else's computer
- that it actually produces all the outputs

:::
14 changes: 11 additions & 3 deletions 03-automatically_saving_figures.md
@@ -4,9 +4,9 @@
Say you have 53 figures and 23 tables, the latter created from 161 different specifications. That makes for a lot of work when re-running the code, if you haven't automated the saving of said figures and tables.


:::{warning}
We have seen instructions that tell the replicator to right-click and save the figures. While there is no substitute for comparing all these figures, that's too much work!
:::

## TL;DR

@@ -71,8 +71,16 @@ ggsave(ggp,file.path(figures,"figure1.png"))

:::{tab-item} Python

There are many ways to do this in Python, which is often geared towards "headless" processing. Even if you use Jupyter notebooks, you should save the figures!

```python
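import os
import matplotlib.pyplot as plt

os.makedirs("figures", exist_ok=True)  # make sure the output directory exists
plt.plot([1, 2, 3], [1, 4, 9])
# Save before calling plt.show(): after the window is closed, the figure may be empty
plt.savefig(os.path.join("figures", "foo.png"))
plt.savefig(os.path.join("figures", "foo.pdf"))
plt.show()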
```
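
With dozens of figures, the same pattern is usually wrapped in a loop. The following is a minimal sketch, not taken from the original text: the helper name `save_all_figures` and the dictionary layout are assumptions made for illustration, and the `Agg` backend is selected so that the code also runs on machines without a display.

```python
import os

import matplotlib

matplotlib.use("Agg")  # non-interactive backend: no window is opened
import matplotlib.pyplot as plt


def save_all_figures(datasets, outdir="figures"):
    """Hypothetical helper: save one PNG per entry of `datasets`.

    `datasets` maps a figure name to an (x, y) pair of sequences.
    """
    os.makedirs(outdir, exist_ok=True)
    saved = []
    for name, (x, y) in datasets.items():
        fig, ax = plt.subplots()
        ax.plot(x, y)
        ax.set_title(name)
        path = os.path.join(outdir, f"{name}.png")
        fig.savefig(path)
        plt.close(fig)  # free memory when looping over many figures
        saved.append(path)
    return saved
```

Calling `save_all_figures({"figure1": ([1, 2, 3], [1, 4, 9])})` would write `figures/figure1.png` without any right-clicking.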


69 changes: 69 additions & 0 deletions 04-01-creating_log_files_manually.md
@@ -0,0 +1,69 @@
(creating-log-files-explicitly)=
# Creating log files explicitly

We start by describing how to explicitly generate log files as part of the statistical processing code.

::::{tab-set}


:::{tab-item} Stata

```stata
global logdir "${rootdir}/logs"
cap mkdir "$logdir"
local c_date = c(current_date)
local cdate = subinstr("`c_date'", " ", "_", .)
local c_time = c(current_time)
local ctime = subinstr("`c_time'", ":", "_", .)
local globallog = "$logdir/logfile_`cdate'-`ctime'-`c(username)'.log"
log using "`globallog'", name(global) replace text
```

To do this automatically each time Stata starts, see the [Stata manual](https://www.stata.com/manuals/gswb.pdf#gswB.3).

:::

:::{tab-item} R

```R
# This will only log output ("stdout") and warnings/messages ("stderr"), but not the commands themselves!

logfile.name <- paste0("logfile_", Sys.Date(),"-",format(as.POSIXct(Sys.time()), format = "%H_%M"),"-",Sys.info()["user"], ".log")
globallog <- file(file.path(rootdir,logfile.name), open = "wt")
# Send output to logfile
sink(globallog, split=TRUE)
sink(globallog, type = "message")

## ... your analysis code goes here ...

## revert output back to the console
sink(type = "message")
sink()
close(globallog)
```

:::

:::{tab-item} MATLAB

```matlab
% The "diary" function should achieve this (not a MATLAB expert!). Call "diary off" once your code has run.
diary('logfile.log')
```
:::

:::{tab-item} Python

```python
# The logging module should achieve this.
import logging
logging.warning('Watch out!')
```
will output

```
WARNING:root:Watch out!
```
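
To mirror the Stata and R examples, the sketch below sends Python log messages to a timestamped file as well as to the console. It is a minimal sketch under stated assumptions, not the only way to do this: the `logs` directory, the file name pattern, and the helper name `start_log` are made up for illustration, while `logging.basicConfig`, `logging.FileHandler`, and `logging.StreamHandler` come from the standard library.

```python
import logging
import os
import sys
from datetime import datetime


def start_log(logdir="logs"):
    """Hypothetical helper: log to a timestamped file and to the console."""
    os.makedirs(logdir, exist_ok=True)
    stamp = datetime.now().strftime("%Y-%m-%d-%H_%M_%S")
    logfile = os.path.join(logdir, f"logfile_{stamp}.log")
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
        handlers=[
            logging.FileHandler(logfile),       # copy in the log file
            logging.StreamHandler(sys.stdout),  # copy on the console
        ],
        force=True,  # replace any earlier logging configuration (Python 3.8+)
    )
    return logfile
```

After `start_log()` has been called once at the top of the main script, every `logging.info(...)` or `logging.warning(...)` call is written to the file. Output from plain `print()` calls is not captured this way.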

:::

::::

While some software (Stata, MATLAB) will create log files that contain commands and output, others (R, Python) will create log files that contain only output.
104 changes: 104 additions & 0 deletions 04-02-creating_log_files_automatically.md
@@ -0,0 +1,104 @@
(creating-log-files-automatically)=
# Creating log files automatically

An alternative (or complement) to creating log files explicitly is to use the software's native functionality to create them. This is usually triggered when running the software from the **command line**, and thus may be considered an **advanced topic.** The examples below are for Linux/macOS, but similar functionality exists for Windows.


:::::{tab-set}


::::{tab-item} Stata

To automatically create a log file, run Stata from the command line with the `-b` option:

```bash
stata -b do main.do
```

which will create a file `main.log` in the same directory as `main.do`.

:::{warning}
For this to work, the filename cannot include spaces.
:::

On Windows, follow the instructions in the [Stata manual](https://www.stata.com/manuals/gswb.pdf#gswB.5).

::::

::::{tab-item} R

To automatically create a log file, run R from the command line as follows:

```bash
R CMD BATCH main.R
```

which will create a file `main.Rout` in the same directory as `main.R`.

:::{warning}
If the R code redirects output itself, for example with `sink()`, that output will not appear in the `main.Rout` file.
:::

::::

::::{tab-item} MATLAB

To automatically create a log file, run MATLAB from the command line as follows:

```bash
matlab -nodisplay -r "addpath(genpath('.')); main" -logfile matlab.log
```

A similar command on Windows would be:

```bash
start matlab -nosplash -minimize -r "addpath(genpath('.'));main" -logfile matlab.log
```

::::

::::{tab-item} Julia, Python

To capture screen output in Julia and Python on Unix-like systems (Linux, macOS), the following can be run:

```bash
julia main.jl | tee main.log
```

or

```bash
python main.py | tee main.log
```

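which uses the `tee` command to create a log file with everything the program writes to standard output, while still showing it on the console. Error messages written to standard error are not captured unless they are also redirected, by adding `2>&1` before the pipe.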
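
Where `tee` is not available (for example on Windows), a similar effect can be approximated from Python itself. The sketch below is one possible approach, not part of the original instructions: the helper name `run_and_log` is hypothetical, and it merges standard error into standard output so that error messages also reach the log.

```python
import subprocess
import sys


def run_and_log(cmd, logfile):
    """Hypothetical helper: run `cmd`, echo its output, and copy it to `logfile`.

    Returns the exit code of the command.
    """
    with open(logfile, "w", encoding="utf-8") as log:
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge error messages into the same stream
            text=True,
        )
        for line in proc.stdout:
            sys.stdout.write(line)  # still show the line on the console
            log.write(line)         # and keep a copy in the log file
        return proc.wait()
```

Assuming a script called `main.py` in the current directory, `run_and_log([sys.executable, "main.py"], "main.log")` would play the role of `python main.py 2>&1 | tee main.log`.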

::::

:::::

## Takeaways

### What this does

This ensures

- that your code runs without problems, after all the debugging.
- that your code runs without manual intervention.
- that your code generates a log file that you can inspect, and that you could share with others.

### What this does not do

This does not ensure

- that it will run on somebody else's computer
  - because it does not guarantee that all the software is there
  - because it does not guarantee that all the directories for input or output are there
  - because many intermediate files might be present that are not in the replication package
  - because it does not guarantee that all the directory names are correctly adjusted everywhere in your code
- that it actually produces all the outputs
  - because some outputs might be present from test runs
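
The last point, outputs left over from test runs, can be checked mechanically. The sketch below is one possible check, assuming you keep a list of the output files you expect: given the time at which the run started, it reports any file that is missing or that was not modified since then. The function name `check_outputs` is hypothetical, and file timestamps are only a heuristic.

```python
import os


def check_outputs(expected, started_at):
    """Return (missing, stale) lists for the expected output files.

    A file counts as stale if it exists but was last modified before
    `started_at`, which suggests it is left over from an earlier run.
    """
    missing, stale = [], []
    for path in expected:
        if not os.path.isfile(path):
            missing.append(path)
        elif os.path.getmtime(path) < started_at:
            stale.append(path)
    return missing, stale
```

To use it, record `started_at = time.time()` just before launching the main script, then call `check_outputs(...)` with the list of expected figures and tables once the run has finished. Both returned lists should be empty.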

### What to do next

To solve some of these problems, let's go to the next step.