Please login or register.

Documentation and cleanup of source code

What

Clean up and properly document (doxygen) the monero source code, and handle any merges necessary to do so seamlessly with other concurrent development efforts.

Who

I'm Thomas Winget, and I've been contributing to the Monero codebase since around June 2014. My largest contribution is the migration from storing the blockchain in a static, binary format to storing the blockchain using a database API, as well as two BlockchainDB implementations (LMDB and BerkeleyDB). Other contributions of mine can be found via the network graph on github or simply looking at my fork.

As a side effect of the work I've done already on Monero, I'm rather familiar with most of the codebase. This puts me in a good position to write down how everything works and fits together, as I've had to sort that out already for many parts.

As a side effect of this effort, I will further cement the knowledge I have of the codebase and gain knowledge in areas I'm less familiar with, enabling me to efficiently implement new features in the future and guide others who may have questions regarding a specific module. In addition, documenting the code will function as a sort of code review, allowing the possibility of spotting any bugs (minor or otherwise).

Why

In my time developing on Monero, there have been countless instances where I need to reference what another piece of code does in order to work with it, and having all the code documented properly in an easily accessible manner (doxygen) would be helpful for that for anyone working in Monero's codebase. I have also come across areas where the code could be made more efficient or clear without changing any functionality, as well as many typographical errors (though these are minor, I think we can all agree they're annoying if nothing else).

Documentation is by far not a glamorous task (probably why it's a bit sparse in Monero), but every development effort benefits immensely from properly documented code.

Proposal and Milestones

I would like to spend about 20 hours per week working on the above, at a rate of 45 XMR per hour (just over $20/hr USD at the time of writing this), paid out every week. I am unsure whether or not 80 hours of work will be enough to cover the entire codebase. There are over 70,000 lines of source in the src/ folder alone, and there is code in other folders that brings the total to over 100k, so this is no small task!

Milestones would be as follows:

  • First 20 hours (900/3600 XMR)
  • Second 20 hours (1800/3600 XMR)
  • Third 20 hours (2700/3600 XMR)
  • Fourth 20 hours (3600/3600 XMR)

After each milestone (and during), I will be available for comment, so anything that seems unclear in my documentation can be addressed and dealt with accordingly.

Replies: 37
palexander posted 8 years ago Replies: 1 | Weight: 0 | Link [ - ]

Hi there Thomas, just a quick question concerning the status of this project? Is it done and how did it turn out? I feel this is actually a very important project for the future of Monero and I was just curious on the status. Also, is it possible to post an example of what you are doing, so interested people can get an idea of what is taking place?

Reply to: palexander
tewinget posted 8 years ago Weight: 0 | Link [ - ]

For examples of what I'm doing, look no further than the other commits thus far on this task. :)

Not done yet, but my goal is to wrap up the final 20 hours (and possibly more, to compensate some for the increase in XMR's value) in the next couple of days.

tewinget edited 8 years ago Weight: 0 | Link [ - ]

Ok, final 20 hours complete at long last. Really a bit longer, but I'm still not anywhere near "done". I just likely won't attempt to do so much at a time again... >_>

Much of this time was spent rebasing/cherry-picking the previous milestones' commits, which is why above I said "really a bit longer", as that didn't quite seem fair. I'll be doing more yet, but as I'm over 20 hours by a bit and just wrapped up a section, I thought it's time for an update. New developments follow below.


Transaction pool documentation:

commit c6bb201a07d84e195be901acda2fcd0b886bf66f
Author: Thomas Winget 
Date:   Thu Mar 24 20:03:02 2016 -0400

    Transaction pool documentation (and some cleanup)

    tx_pool.h doxygen documentation completed.
    Many notes made on areas for improvement, be that functionality or
    code clarity.
    Commented code and unused code removed.

As the pull request for the previous milestones is still pending, I have not yet submitted this as a pull request. The changes can be found here on my github.


Testnet documentation:

This documentation isn't something that will go in the source code directly, and thus not on the github. I went through every instance of "testnet" in the source to evaluate what "testnet or not" means. I could probably make the resulting document look nicer, but it should be clear enough. While this is not directly documenting the source code, it will lead to better and ideally self-documenting code. For now, I'll simply paste the document below, but if anyone has any questions I'll be glad to answer them. I've also uploaded it to fpaste because the preview shows that it's ugly on here.

For reading it:

The organization is source lines followed by their purpose, followed by a line of hyphens, repeat for all purposes of the notion of "testnet". Source lines which are grouped together without a line between are directly related to one another. The exception to this is the "pass the testnet flag around" section, as it didn't seem useful to group them in this manner.

EDIT: The formatting the preview showed broke into fragile pieces, apparently, so just look at it on fpaste.

tewinget posted 8 years ago Weight: -59 | Link [ - ]

Okay, long delay since the last milestone update, but here's #3:

BlockchainDB's documentation is now Doxygen-compliant and up-to-date with recent changes. A couple of things were renamed, as their purpose had changed but their names implied the old purpose.

It's amazing how much things change in under a year. BlockchainDB has undergone many changes, so the new documentation is not guaranteed to be 100% correct. I'll be attaching my notes to the end of this post, some of them refer to bits I still need to ask about (or spend time looking into). I'll also paste the commit logs at the end, because why not.

If anyone has thoughts on my notes below, please don't hesitate, as many of them I'm not 100% sure what the best course of action is at this time.

As always, you can follow updates on my github repo.




Current TODO list

General:

Discuss updating function names to better reflect their purpose.

Fix inconsistent indentation.

evaluate return values of all functions (lots of bool return values not being used, etc.) related to this, consider throwing rather than returning empty containers where a container is the return type

src/

cryptonote_core/

cryptonote_core.*/

  move BlockchainDB initialization and DB-specific code from cryptonote_core.* to better places

blockchain.*/

  evaluate locking mechanism and possibility of separate reader/writer locks

blockchain_db/

blockchain_db.*/

  remove_transaction_data() should not remove outputs as well, but currently
  subclasses do so.  This should be reorganized.  The current subclass function
  to do so is remove_tx_outputs()

  global output index counting should maybe be handled in the parent class,
  up for debate.

  Need to make sure that aborting a partial add_block after an exception
  will not error out part way through due to another exception.  This could
  be tricky, as "remove_*" might fail because it hadn't been added yet, or
  it might fail because of a database issue.  This will likely require more
  fine-grained exceptions.  For now, with the exceptions that exist, I think
  that a generic DB_ERROR is suitable for, well, a db error, and the DNE
  exceptions should be used as such.  This will likely be sufficient to
  differentiate.

  add_transaction (concrete in parent) takes a pointer to a tx hash as a
  parameter.  This should be a reference, but perhaps there's a reason it
  is not.  Needs looking into.

  Destructor: should it call sync() and close() if open?

  open: db_flags should either be handled in subclass implementations
  or standardized among all blockchaindbs.

  batch transactions: batch_stop and batch_commit both have commit code,
  batch_stop should call to batch_commit instead.

  IMPORTANT: make sure BlockchainDB (and/or subclasses) handles correctly
  removing partial block data in the event of a failure during a block add.

  get_blocks_range: should this throw if the request is out of range, or
  should it let get_block_from_height throw and just ignore it as it does
  now?

  get_tx_list: much the same as above with get_blocks_range.

  *important* hardfork functions: verify that documented functionality is true/correct



** Git log **

commit cc3394b38d87785aafe850e58dc2d79cd65d47de
Author: Thomas Winget 
Date:   Tue Dec 1 17:38:22 2015 -0500

    Update BlockchainDB documentation

    BlockchainDB is now Doxygen-compliant and its documentation is
    up-to-date with recent changes.

 src/blockchain_db/blockchain_db.h | 1063 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----------------------------
 1 file changed, 911 insertions(+), 152 deletions(-)

commit 204ed981b0e114dc122de16d8642ec44745074f1
Author: Thomas Winget 
Date:   Tue Dec 1 15:03:46 2015 -0500

    Rename function for clarity

    As BlockchainDB::get_output_key was changed to return a struct containing
    the output key along with two other pieces of metadata for the output, the
    function name is now changed to be get_output_data.

    In addition, the database handles in db_bdb and db_lmdb were renamed
    from m_output_keys to m_output_data to mirror this.

 src/blockchain_db/berkeleydb/db_bdb.cpp      | 28 ++++++++++++++--------------
 src/blockchain_db/berkeleydb/db_bdb.h        |  8 ++++----
 src/blockchain_db/blockchain_db.h            |  8 ++++----
 src/blockchain_db/lmdb/db_lmdb.cpp           | 22 +++++++++++-----------
 src/blockchain_db/lmdb/db_lmdb.h             |  8 ++++----
 src/blockchain_utilities/blockchain_dump.cpp |  2 +-
 src/cryptonote_core/blockchain.cpp           | 10 +++++-----
 7 files changed, 43 insertions(+), 43 deletions(-)
tewinget posted 8 years ago Replies: 1 | Weight: -60 | Link [ - ]

I'd like to thank everyone for their patience with me, as I've been very slow to complete this task. I will admit that a large part of that is motivation, as it is a rather boring task. That said, another part of it is frustration. There have been a lot of great changes in the source over the past year, and I have not kept up with even half of them (other than reading commit logs). As I have worked to document some of these changes, I have seen changes which, while I agree with what they do, I disagree with how they've been done (from an organization/maintainability standpoint). I admit 100% that this is an issue of pride, and I will work past it.

As I work to complete this contract, I am taking notes of the things I disagree with. Obviously mine is but one opinion, so I will not change any of these things for this contract (except obvious things like replacing BOOST_FOREACH with C++11 for, etc.). After I've completed this contract I will review my notes and see if I can come up with solutions, then submit them for review.

Again, thank you all (especially those who donated to this effort, but also to the community at-large) for your patience. I intend to spend the whole day working on more documentation, so look for an update at or before 00:00 UTC Wednesday morning.

tl;dr - look for next milestone update at or before the coming midnight UTC.

Reply to: tewinget
Gingeropolous posted 8 years ago Weight: -60 | Link [ - ]

thanks for sticking with it man!

dEBRUYNE posted 8 years ago Replies: 1 | Weight: -119 | Link [ - ]

How is it going? Also, was the 4x20 hours meant as consecutive? Or just arbitrary. I am looking forward to hearing some new ideas as well btw :)

Reply to: dEBRUYNE
tewinget posted 8 years ago Replies: 1 | Weight: -115 | Link [ - ]

It was initially meant to be consecutive or close to. I've had some other things come up, but I look to finish out the remaining 40 hours within a week or so.

Reply to: tewinget dEBRUYNE
dEBRUYNE posted 8 years ago Weight: -111 | Link [ - ]

Thanks for the update!

tewinget edited 8 years ago Replies: 1 | Weight: -167 | Link [ - ]

Update #2:

Second 20 hours completed. Progress can be seen on the branch on my fork.

Rather than retype all of the commit messages here, you'll find a paste of "git log --stat" below, or at least hopefully you will, this preview thing hates me.

Please feel free to have a look through the generated Doxygen and let me know if anything seems unclear or incorrect (including Doxygen I haven't touched yet, it all needs to be correct eventually, and if you spot something I may as well know about it!)

Reply to: tewinget
dEBRUYNE posted 8 years ago Weight: -155 | Link [ - ]

Great work! You can see his pull request here -> https://github.com/monero-project/bitmonero/pull/434

dEBRUYNE edited 8 years ago Replies: 1 | Weight: -170 | Link [ - ]

How's it going?

Reply to: dEBRUYNE
tewinget posted 8 years ago Replies: 1 | Weight: -170 | Link [ - ]

It's going, but tedious. I'll post a proper update tomorrow, probably evening EST.

Reply to: tewinget dEBRUYNE
Gingeropolous posted 8 years ago Replies: 1 | Weight: -170 | Link [ - ]

How could this possibly be tedious?

Your fighting the good fight sir. Thank you thank you.

Reply to: Gingeropolous tewinget dEBRUYNE
dEBRUYNE posted 8 years ago Weight: -170 | Link [ - ]

+1 and thanks for the reply! Looking forward to the update.

tewinget edited 8 years ago Replies: 2 | Weight: -192 | Link [ + ]

Update #1:

First 20 hours completed. Progress can be seen on the new branch on my fork.

As of now, to the best of my knowledge, all of the comments in src/cryptonote_core/blockchain.{h,cpp} accurately reflect reality. There are also many places where the code could be more clear or more organized which I have outlined with FIXME tags (and most with how to fix/what to change, rather than just "fix pl0x"). There are many pieces of code I've labelled "candidate for removal", but not yet removed, as I want to organize my commits by purpose. That may be overkill, I'm not sure...

I'm not sure if I've struck the right balance between thorough and anal, so I'll leave that open to comment. Also, of course, if anything seems unclear please let me know!

I've compiled a to-do list as I went along, and I'll paste it at the end of this comment. It's largely broad strokes (or just reflections of FIXMEs), but if any of it seems out-of-scope for this...project?...feel free to say so.

I apologize that I'm almost a week late; I definitely did not expect this to get funded in a day, so I had other irons in other fires to contend with. Look for the next update next Thursday or Friday.

Anyway, looking forward to continuing. Going forward I'll probably go through and execute on the FIXMEs so far before moving on to document more, but if you all would prefer the latter, that's also totally fine.


To-do list

General: - Discuss updating function names to better reflect their purpose -- emphasis on discuss :)

  • Fix inconsistent indentation.

  • evaluate return values of all functions (lots of bool return values not being used, etc.) related to this, consider throwing rather than returning empty containers where a container is the return type

src/

  • cryptonote_core/

    • cryptonote_core.{h,cpp}

      • move BlockchainDB initialization and DB-specific code from cryptonote_core.* to better places
    • blockchain.{h,cpp}

      • remove remaining defunct blockchain_storage code from blockchain.*

      • evaluate locking mechanism and possibility of separate reader/writer locks

grep -rn FIXME src/ <-- It seems nobody else really uses FIXME, so that's an easy way to tag TODO-like things which are not feature additions but rather (possible or necessary) corrections.

Reply to: tewinget
link posted 8 years ago Weight: -194 | Link [ + ]

nice job tewinget! :D

Reply to: tewinget
Kazuki posted 8 years ago Weight: -195 | Link [ + ]

great work and it will make future contributors more willing to improve monero.

rocco edited 8 years ago Weight: -213 | Link [ + ]

respect to all the donators and also to tewinget for all his past and future work. good job everyone

pwrz posted 8 years ago Weight: -215 | Link [ + ]

Donated.

tewinget posted 8 years ago Weight: -215 | Link [ + ]

Well that was funded far more aggressively than I had anticipated! Thanks to everyone for the support, I'll get started right away!

Lloydimiller4 posted 8 years ago Replies: 1 | Weight: -215 | Link [ + ]

Damnit, I'm too late to donate any.... I hope we get a few more Devs looking for more funding :D

Great job everyong, getting this funded in one day is proof that our community cares.

Reply to: Lloydimiller4
tewinget edited 8 years ago Weight: -213 | Link [ + ]

Given the success of this, I'm eager to do more! What I think I'll do over the next week or two is throw a few ideas in either the open tasks section or the ideas section and see which one(s) is most popular (a weak proxy for most worthy), and go from there.

drfred posted 8 years ago Weight: -216 | Link [ + ]

50xmr sent, hope to send more at the end of next week :)

Shrikez edited 8 years ago Weight: -216 | Link [ + ]

donated 50 XMR, thank you for your effort :)

TX f8f619c29fccf3df15739d96cee17f98e095ac8fb23d835795f31dab84b2af00

moneromooo edited 8 years ago Weight: -217 | Link [ + ]

tewinget is indeed one of the main Monero contributors, and the author of the DB blockchain code, as well as various other things. A very good fit for the job.